I would recommend using the BigQuery streaming API. We do a heck of a lot
of that at Bluecore and it works well. Depending on how your data arrives,
you may want to use a Task Queue or similar to collect lots of rows
together so you can insert batches into BigQuery, which will be more efficient
than streaming one row at a time.
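Roughly, the batching part might look like this (a minimal sketch using the
current google-cloud-bigquery client rather than whatever you already have on
hand; the table and field names are made up):

    import logging
    from google.cloud import bigquery

    client = bigquery.Client()
    table_id = "my-project.analytics.datapoints"  # hypothetical table

    # Rows collected from the task queue / request handlers.
    rows = [
        {"series": "sensor-1", "ts": "2015-02-01T16:00:00Z", "value": 42.0},
        {"series": "sensor-1", "ts": "2015-02-01T16:00:05Z", "value": 43.5},
    ]

    # One streaming insert for the whole batch; errors come back per row.
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        logging.error("BigQuery insert errors: %s", errors)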
Hey Etienne,
You've correctly enumerated a few ways to transfer data from Cloud SQL to
BigQuery:
* export to CSV and load the CSV into BigQuery (see the sketch below)
* retrieve the data with an app and stream it into the BigQuery API
* export to Datastore and then import to BigQuery
There are also other ways.
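For the first option, the load step can be quite small. A sketch with a
made-up bucket and table, assuming the Cloud SQL export has already landed in
Cloud Storage:

    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,
        autodetect=True,
    )
    job = client.load_table_from_uri(
        "gs://my-bucket/cloudsql-export.csv",   # hypothetical export file
        "my-project.analytics.datapoints",      # hypothetical destination table
        job_config=job_config,
    )
    job.result()  # block until the load job finishes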
Hi,
Sorry for the repeat, but I am trying to wrap my head around the
GAE-osphere and I am getting a bit confused.
I need to store and retrieve/analyse time-series data of varying sizes and
resolutions; at the moment, the data is received on GAE and stored in
Google Cloud SQL (Python).
On Sun, Feb 1, 2015 at 4:00 PM, Shailendra Singh srj0...@gmail.com wrote:
This might be a question a bit off track. Somehow I figured out how to store
multiple values in NDB, i.e. using repeated properties. Now my next step is
to create a Google chart. Can someone guide me on this? I haven't found
many tutorials for Google Charts + NDB.
+1 for BigQuery if you only need to add records (not edit or delete them).
We use BigQuery to store analytics data for FollowUs.com
http://followus.com.
Best thing about it is that it works as advertised.
Biggest downside is that queries can take a few seconds to return
results. If you can live with that, it is a good fit.
I was trying to use repeated properties in GAE to store multiple values
inside a property with a timestamp, just like a time-series database. Once
that is done, we can query for the last 1 hour or 2 hours or 1 day and similar.
Can you guide me to a preferred way, as again only 1 MB of repeated
properties fits in a single entity.
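For reference, the repeated-property layout you describe might look something
like this (just a sketch; names are made up, and the whole entity has to stay
under the 1 MB limit):

    import datetime
    from google.appengine.ext import ndb

    class TimeSeries(ndb.Model):
        name = ndb.StringProperty()
        timestamps = ndb.DateTimeProperty(repeated=True)
        values = ndb.FloatProperty(repeated=True)

    def last_hours(series, hours=1):
        # Filter the parallel lists in memory; the datastore only has to
        # fetch the one entity.
        cutoff = datetime.datetime.utcnow() - datetime.timedelta(hours=hours)
        return [(t, v) for t, v in zip(series.timestamps, series.values)
                if t >= cutoff]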
By the way, this isn't a Datastore-specific problem. Even on MySQL, you
don't want to be querying millions of rows to draw a simple summary.
On Sun, Feb 1, 2015 at 10:09 AM, Rafael mufumb...@gmail.com wrote:
To solve that problem you can have DataPoint as a temporary table only.
That way,
To solve that problem you can have DataPoint as a temporary table only.
That way, every 5 minutes you can run a cron that downloads all DataPoints
and deletes them after you summarize the content in another table.
You can summarize into a 5-minute table, then the average from that table
goes to the next level up (hourly, daily, and so on).
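A bare-bones version of that cron handler might look like this (sketch only;
the model names and 5-minute bucketing are assumptions, and a real handler
would page through more than 1000 points):

    from google.appengine.ext import ndb

    class DataPoint(ndb.Model):          # raw, temporary rows
        ts = ndb.DateTimeProperty()
        value = ndb.FloatProperty()

    class Summary5Min(ndb.Model):        # rollup table
        bucket_start = ndb.DateTimeProperty()
        average = ndb.FloatProperty()
        count = ndb.IntegerProperty()

    def summarize():
        points = DataPoint.query().fetch(1000)
        if not points:
            return
        avg = sum(p.value for p in points) / len(points)
        Summary5Min(bucket_start=min(p.ts for p in points),
                    average=avg, count=len(points)).put()
        # Delete the raw points once they are summarized.
        ndb.delete_multi([p.key for p in points])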
What about using BigQuery? Has anybody tried it for this specific purpose?
Inserting data and exporting a whole table is free at this stage.
By the way, I've tried a couple of strategies involving entity groups. I
started storing the timestamp as the key.
That improved things a little bit.
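In case it helps anyone, the timestamp-as-key idea is roughly this (a sketch;
the series/timestamp key format is just one way to do it):

    import datetime
    from google.appengine.ext import ndb

    class DataPoint(ndb.Model):
        value = ndb.FloatProperty()

    def put_point(series, ts, value):
        # Encode series + timestamp into the key name so keys sort by time
        # within a series and point lookups by time need no extra query.
        key_name = "%s|%s" % (series, ts.strftime("%Y%m%d%H%M%S%f"))
        DataPoint(id=key_name, value=value).put()

    put_point("sensor-1", datetime.datetime.utcnow(), 42.0)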
This might be a question a bit off track. Somehow I figured out how to store
multiple values in NDB, i.e. using repeated properties. Now my next step is
to create a Google chart. Can someone guide me on this? I haven't found
many tutorials for Google Charts + NDB. Google Charts has its own docs, but
nothing that covers feeding it from NDB.
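There isn't much to it, though: you just turn the query results into the row
array that google.visualization.arrayToDataTable() expects on the client.
A hypothetical sketch (model and field names made up):

    import json
    from google.appengine.ext import ndb

    class Reading(ndb.Model):
        ts = ndb.DateTimeProperty()
        value = ndb.FloatProperty()

    def chart_rows():
        rows = [["Time", "Value"]]                  # header row
        for r in Reading.query().order(Reading.ts).fetch(500):
            rows.append([r.ts.isoformat(), r.value])
        return json.dumps(rows)  # hand this to arrayToDataTable() in the page's JS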
Have a look at Nimbits; it stores time series in the App Engine datastore. It's
written in Java, but the data models used should be straightforward to
translate into NDB.
T
On Friday, January 30, 2015 at 4:32:52 AM UTC+8, Shailendra Singh wrote:
Hi Rafael
It's an old thread, but can you please
I've got some code for this from a recent project. Hit me up.
Log individual events, then run MapReduce to aggregate into time slices, and also
by field values, to create preaggregated counts.
Querying is not as nimble as, say, Mongo, so this works, but with a few extra steps.
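The aggregation step is basically just bucketing by time slice and field
value; in plain Python it would look something like this (illustrative only,
not the actual MapReduce code):

    import datetime
    from collections import Counter

    def slice_key(ts, minutes=5):
        # Round a timestamp down to the start of its time slice.
        return ts.replace(minute=(ts.minute // minutes) * minutes,
                          second=0, microsecond=0)

    def preaggregate(events, field="country"):
        # events: iterable of dicts like {"ts": datetime, "country": "AT", ...}
        counts = Counter()
        for ev in events:
            counts[(slice_key(ev["ts"]), ev[field])] += 1
        return counts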
Hi Rafael
It's an old thread, but can you please share some information on how you
stored different rows for hour, day, week, month, year, etc. ("You can
squeeze a lot of data in 1 MB :)") in GAE? I am new to GAE and I am trying to
store some time-series data with respect to an entity in NDB.
Thanks
Keep in mind that this can get very expensive very fast, and on-the-fly
aggregation is pretty much unavailable. You might consider running a
specialized timeseries db on GCE or some other cloud host.
Jeff
On Wed, Aug 14, 2013 at 7:51 AM, Martin Trummer martin.trum...@dewesoft.com
wrote:
I'd recommend building a test application to load in a bunch of dummy
entries, and seeing what performance you get out of it. From there we can
discuss specific optimization strategies and so forth depending on where
the bottlenecks turn up.
-Vinny P
Technology Media Advisor
On Tuesday, 13 August 2013 22:42:25 UTC+2, Jay wrote:
In my opinion, your biggest takeaway from this should be to avoid having
a mega entity group, and you do this by simply *not* having all the
entities in question have the same parent. Or, perhaps more pointedly, any
parent at all.
Okay, so you have two entity types, TimeSeriesIndex and DataPoint.
But what about the DataPoint entity? You also have the same problem there,
right?
All your data ends up in the DataPoint entity. Or does your cron job
delete the DataPoints after generating the TimeSeriesIndex?
On Wednesday, 14
If you do not specify an ancestor, the entity group of the entity consists
of only itself.
So if you create 2 million entities with no parent entity, then you have 2
million separate entity groups.
Which is fine for what you are doing.
Anything else will severely limit write throughput.
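In NDB terms, that just means never passing a parent when you create the
entities (sketch; the model is hypothetical):

    import datetime
    from google.appengine.ext import ndb

    class DataPoint(ndb.Model):
        series = ndb.StringProperty()
        ts = ndb.DateTimeProperty()
        value = ndb.FloatProperty()

    # No parent/ancestor key, so every put() creates its own root entity
    # group and writes to different points never contend on one group.
    DataPoint(series="sensor-1", ts=datetime.datetime.utcnow(), value=1.2).put()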
Great, thanks timh.
That was the point I was missing!
Kind regards,
Martin Trummer
In my opinion, your biggest takeaway from this should be to avoid having a
mega entity group, and you do this by simply *not* having all the entities
in question have the same parent. Or, perhaps more pointedly, any parent at
all. Unless there is a really strong case to put many thousands of entities
into a single entity group, don't do it.
I implemented this with these components:
- TimeSeriesIndex: different rows for hour, day, week, month, year, etc.
You can squeeze a lot of data in 1 MB :) (see the sketch after this list)
- DataPoint: unprocessed data-point data, thousands of rows per minute
- a cron job that processes the DataPoints into the indexes
- the UI
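A rough sketch of what those two models might look like (field names are
guesses; the original schema may differ):

    from google.appengine.ext import ndb

    class TimeSeriesIndex(ndb.Model):
        # One row per (series, resolution): hour, day, week, month, year.
        resolution = ndb.StringProperty(
            choices=["hour", "day", "week", "month", "year"])
        timestamps = ndb.DateTimeProperty(repeated=True)
        values = ndb.FloatProperty(repeated=True)  # plenty fits under the 1 MB entity limit

    class DataPoint(ndb.Model):
        # Raw, unprocessed points; thousands per minute, rolled up by the cron.
        ts = ndb.DateTimeProperty()
        value = ndb.FloatProperty()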