[google-appengine] Re: Datastore: how to design for huge time-series data

2017-02-14 Thread Evan Jones
I would recommend using the BigQuery streaming API. We do a heck of a lot of that at Bluecore and it works well. Depending on how your data arrives, you may want to use a Task Queue or similar to collect lots of rows together to be able to insert batches into BigQuery, which will be more
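The batching idea above can be sketched as a small buffer that collects rows and hands them off in groups. This is only an illustration under assumptions: `RowBatcher`, `insert_fn`, and the batch size of 500 are hypothetical names and values, not something from the thread. In a real app, `insert_fn` might wrap a BigQuery streaming insert call (for example the `google-cloud-bigquery` client's `insert_rows_json`), which is deliberately left out here so the sketch runs anywhere.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class RowBatcher:
    """Buffers rows and passes them to `insert_fn` in batches.

    `insert_fn` is a hypothetical callable that performs the actual
    insert (e.g. a wrapper around a BigQuery streaming insert).
    """

    def __init__(self, insert_fn: Callable[[List[Row]], None], batch_size: int = 500):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._insert_fn = insert_fn
        self._batch_size = batch_size
        self._buffer: List[Row] = []

    def add(self, row: Row) -> None:
        """Queue one row; flush automatically once the batch is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, if anything."""
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        self._insert_fn(batch)
```

A caller would `add()` rows as they arrive and call `flush()` at the end of a request or task so that a partial batch is not left behind.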

[google-appengine] Re: Datastore: how to design for huge time-series data

2017-02-14 Thread 'Nick (Cloud Platform Support)' via Google App Engine
Hey Etienne, You've correctly enumerated a few ways to transfer data from Cloud SQL to BigQuery: * export to CSV and load the CSV into BigQuery * retrieve the data with an app and stream it into the BigQuery API * export to Datastore and then import to BigQuery There are also other ways,

[google-appengine] Re: Datastore: how to design for huge time-series data

2017-02-14 Thread Etienne B. Roesch
Hi, Sorry for the repeat, but I am trying to wrap my head around the GAE-osphere and I am getting a bit confused; I need to store and retrieve/analyse timeseries data, of varying sizes and resolutions; at the moment, the data is received and stored on GAE through to Google Cloud SQL (python).

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-03 Thread Vinny P
On Sun, Feb 1, 2015 at 4:00 PM, Shailendra Singh srj0...@gmail.com wrote: This question might be off topic. Somehow I figured out how to store multiple values in NDB, i.e. using repeated properties. My next step is to create a Google chart. Can someone guide me with that? I haven't

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-03 Thread Nickolas Daskalou
+1 for BigQuery if you only need to add records (not edit or delete them). We use BigQuery to store analytic data for FollowUs.com http://followus.com. Best thing about it is that it works as advertised. Biggest downside is that queries can take a few seconds to return with results. If you can

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-01 Thread Shailendra Singh
I was trying to use repeated properties in GAE to store multiple values inside a property with a timestamp, just like a time-series database. Once that is done, we can query for the last 1 hour, 2 hours, 1 day, and so on. Can you point me to a preferred way, as again only 1 MB of repeated

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-01 Thread Rafael
By the way, this isn't a Datastore-specific problem. Even on MySQL, you don't want to be querying millions of rows to draw a simple summary. On Sun, Feb 1, 2015 at 10:09 AM, Rafael mufumb...@gmail.com wrote: To solve that problem you can have DataPoint as a temporary table only. That way,

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-01 Thread Rafael
To solve that problem you can have DataPoint as a temporary table only. That way, every 5 minutes you can run a cron job that downloads all DataPoint rows, summarizes their content in another table, and then deletes them. You can summarize into a 5-minute table, then the average from that table goes to the
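A minimal sketch of the summarizing step such a cron job might run, assuming each raw data point is a `(unix_timestamp, value)` pair. The function name `summarize` and the output shape are illustrative assumptions, and storage and deletion of the raw rows are left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize(points: Iterable[Tuple[float, float]],
              bucket_seconds: int = 300) -> Dict[int, Dict[str, float]]:
    """Group (timestamp, value) pairs into fixed-width buckets.

    Returns a dict keyed by bucket start time (seconds), each holding
    the count, sum, and average of the values in that bucket.
    """
    totals = defaultdict(lambda: [0, 0.0])  # bucket -> [count, sum]
    for ts, value in points:
        bucket = int(ts // bucket_seconds) * bucket_seconds
        totals[bucket][0] += 1
        totals[bucket][1] += value
    return {
        bucket: {"count": count, "sum": total, "avg": total / count}
        for bucket, (count, total) in sorted(totals.items())
    }
```

The same function could be reused with a larger `bucket_seconds` to roll the 5-minute summaries up into coarser tables, though averaging averages is only exact when the counts are carried along, which is why the count and sum are kept in the output.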

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-01 Thread Emanuele Ziglioli
What about using BigQuery? Has anybody tried it for this specific purpose? Inserting data and exporting a whole table is free at this stage. By the way, I've tried a couple of strategies involving entity groups. I started by storing the timestamp as the key. That improved things a little bit, in the

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-02-01 Thread Shailendra Singh
This question might be off topic. Somehow I figured out how to store multiple values in NDB, i.e. using repeated properties. My next step is to create a Google chart. Can someone guide me with that? I haven't found many tutorials for Google Charts + NDB. Google Charts has Google Docs but

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-01-31 Thread timh
Have a look at Nimbits; it stores time series in the App Engine Datastore. It's written in Java, but the data models used should be straightforward to translate into NDB. T On Friday, January 30, 2015 at 4:32:52 AM UTC+8, Shailendra Singh wrote: Hi Rafael, it's an old thread, but can you please

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-01-31 Thread gregory nicholas
I've got some code for this from a recent project; hit me up. Log individual events, then run MapReduce to aggregate them into time slices, and also by field values, to create preaggregated counts. Querying is not as nimble as, say, Mongo, so this works, but it takes a few extra steps.
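The poster's code is not shown in the thread, so the following is only a plain-Python stand-in for the map and reduce steps described: each event is mapped to a `(time_slice, field_value)` key and the keys are counted. The event shape (a dict with a `ts` field in seconds) and the name `preaggregate` are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def preaggregate(events: Iterable[Dict[str, object]],
                 field: str,
                 slice_seconds: int = 3600) -> Counter:
    """Count events per (time slice start, field value).

    Assumes each event is a dict with a numeric "ts" (Unix seconds)
    and a hashable value under `field`.
    """
    counts: Counter = Counter()
    for event in events:
        # "Map" step: derive the aggregation key for this event.
        slice_start = (int(event["ts"]) // slice_seconds) * slice_seconds
        key: Tuple[int, Hashable] = (slice_start, event[field])
        # "Reduce" step: sum the occurrences of each key.
        counts[key] += 1
    return counts
```

Queries then read the small table of counts instead of scanning the individual events.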

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2015-01-29 Thread Shailendra Singh
Hi Rafael, it's an old thread, but can you please share some information on how you stored "different rows for hour, day, week, month, year, etc. You can squeeze a lot of data in 1mb :)" in GAE? I am new to GAE and I am trying to store some time-series data with respect to an entity in NDB. Thanks On

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-15 Thread Jeff Schnitzer
Keep in mind that this can get very expensive very fast, and on-the-fly aggregation is pretty much unavailable. You might consider running a specialized timeseries db on GCE or some other cloud host. Jeff On Wed, Aug 14, 2013 at 7:51 AM, Martin Trummer martin.trum...@dewesoft.com wrote:

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-15 Thread Vinny P
I'd recommend building a test application to load in a bunch of dummy entries, and seeing what performance you get out of it. From there we can discuss specific optimization strategies and so forth depending on where the bottlenecks turn up. - -Vinny P Technology Media Advisor

[google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-14 Thread Martin Trummer
On Tuesday, 13 August 2013 22:42:25 UTC+2, Jay wrote: In my opinion, your biggest takeaway from this should be to avoid having a mega entity group, and you do this by simply *not* having all the entities in question have the same parent. Or perhaps more pointedly, any parent at all.

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-14 Thread Martin Trummer
Okay, so you have 2 entity types, TimeSeriesIndex and DataPoint. But what about the DataPoint entity? You also have the same problem there, right? All your data ends up in the DataPoint entity. Or does your cron job delete the DataPoints after generating the TimeSeriesIndex? On Wednesday, 14

[google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-14 Thread timh
If you do not specify an ancestor, the entity group of the entity consists of only itself. So if you create 2 million entities with no parent entity, then you have 2 million separate entity groups, which is fine for what you are doing. Anything else will severely limit write throughput.
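A toy model of that point, not the Datastore or NDB API: it represents a key as a tuple of `(kind, id)` pairs from root to leaf and treats the first pair as the entity group root. The names `entity_group_root` and `count_entity_groups` are hypothetical.

```python
from typing import Iterable, Tuple, Union

KeyPart = Tuple[str, Union[int, str]]
KeyPath = Tuple[KeyPart, ...]


def entity_group_root(key_path: KeyPath) -> KeyPart:
    """The entity group is identified by the root (first) element of the key path."""
    return key_path[0]


def count_entity_groups(key_paths: Iterable[KeyPath]) -> int:
    """Number of distinct entity groups among the given keys."""
    return len({entity_group_root(path) for path in key_paths})


# Entities created without a parent: every key is its own root.
no_parent = [(("DataPoint", i),) for i in range(1000)]

# Entities sharing one parent: all of them fall into a single group.
shared_parent = [(("Series", "s1"), ("DataPoint", i)) for i in range(1000)]
```

In this model, `count_entity_groups(no_parent)` gives one group per entity, while `count_entity_groups(shared_parent)` collapses to a single group, which is the "mega entity group" the thread warns against.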

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-14 Thread Martin Trummer
Great, thanks timh. That was the point I was missing! Mit freundlichen Grüssen/Kind regards, Martin Trummer __ DI (FH) Martin Trummer Mobile: +43 676 700 47 81 skype:ds.martin.trummer mailto:martin.trum...@dewesoft.com *Attention! New mailaddress ends with .**com

[google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-13 Thread Jay
In my opinion, your biggest takeaway from this should be to avoid having a mega entity group, and you do this by simply *not* having all the entities in question have the same parent. Or perhaps more pointedly, any parent at all. Unless there is a really strong case to put many thousands of

Re: [google-appengine] Re: Datastore: how to design for huge time-series data

2013-08-13 Thread Rafael
I implemented this by having these components:
- TimeSeriesIndex - different rows for hour, day, week, month, year, etc. You can squeeze a lot of data in 1mb :)
- DataPoint - unprocessed data point data. Thousands of rows per minute.
- a cron job that processes the data points into the indexes
- the ui
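The thread does not show how the index rows are keyed, so the following is only one possible scheme, assuming one index row per resolution and a string key derived from the timestamp. The function name `bucket_keys` and the key formats are illustrative assumptions.

```python
from datetime import datetime, timezone
from typing import Dict


def bucket_keys(ts: datetime) -> Dict[str, str]:
    """Return the index row key for each resolution that `ts` falls into.

    The week key uses the ISO year and ISO week number.
    """
    iso = ts.isocalendar()
    iso_year, iso_week = iso[0], iso[1]
    return {
        "hour": ts.strftime("%Y-%m-%dT%H"),
        "day": ts.strftime("%Y-%m-%d"),
        "week": f"{iso_year}-W{iso_week:02d}",
        "month": ts.strftime("%Y-%m"),
        "year": ts.strftime("%Y"),
    }


keys = bucket_keys(datetime(2013, 8, 13, 20, 42, tzinfo=timezone.utc))
```

A cron job could use these keys to decide which index row each raw data point should be folded into at every resolution.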