Yesterday, the App Engine team hosted another block of its bimonthly
IRC office hours. A transcript of the session and a summary of the
topics covered are provided below. The next session will take place on
Wednesday, June 17th from 9:00-10:00 a.m. PDT in the #appengine
channel on irc.freenode.net.

------------------------------------------------------------------------------------------

- We may add support for additional payment platforms/systems down the
road but other projects are a higher priority at the moment. If you
want to see support added for a particular payment system, please file
a new request in the issue tracker or star an existing request. [7:02,
7:07]

- Some users are seeing 500 errors returned for a small percentage of
requests, especially when under heavy load or after deployment. These
are not reported in the error logs but may appear in the request-only
logs. If these errors re-appear, please check your request logs and
file a new report in the issue tracker with any relevant details such
as your app's load at the time and the frequency of the error. A
screenshot of the error page might help as well. [7:04, 7:16, 7:21]

- Q: Any plans to publish an interface/documentation for the
development server, e.g. allowing for the Jetty server to be started
and stopped from within another programming environment? A: There are
plans to open source the Java SDK soon which may help with this. [7:09
- 7:18, 7:25 - 7:33]

- Q: Is it normal that posting (saving or updating) 20 root entities
with no children takes about 2 seconds? A: Rather than adding entities
serially, add them in a single batch, which reduces the number of round
trips and can shave off roughly half of the time. [7:19 - 7:22]

- Q: How can one estimate storage space requirements? A: It's not
straightforward right now -- in addition to raw data, metadata and
indexes have to be stored, which adds to the total storage needed by
your application. With the latest release, you can disable indexes for
single properties that you don't plan to query, which will save
storage and make writes slightly faster. We plan to provide more
specific information about how data storage size is calculated going
forward. [7:33, 7:42 - 7:45]

- Discussion on how to best go about querying for points within a
bounding box in a geo-based application [7:36, 7:39 - 7:42]

- SQL-like LIKE behavior can be approximated in App Engine by
filtering on string prefix; see
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html#Introducing_Indexes
[7:45, 7:48]

- Session videos for last week's Google I/O will be posted shortly.
Included among these is a presentation on App Engine's new task queue
system which will be rolled out in the not-too-distant future. We'll
post to the blog when these are available. [7:59]
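
As a rough illustration of the batch-put advice above [7:19 - 7:22],
the sketch below counts round trips for serial versus batched writes.
FakeDatastore and ROUND_TRIP_MS are hypothetical stand-ins, not part of
the SDK, and the 50 ms latency is only the figure implied in the chat.
In the Python SDK the equivalent change is passing the whole list to
db.put() instead of calling put() inside the loop.

```python
# FakeDatastore and ROUND_TRIP_MS are hypothetical names used only to
# count round trips; they are not part of the App Engine SDK.
ROUND_TRIP_MS = 50  # assumed per-call latency, as implied in the chat


class FakeDatastore(object):
    def __init__(self):
        self.round_trips = 0
        self.stored = []

    def put(self, entities):
        """Accept one entity or a list; every call counts as one round trip."""
        if not isinstance(entities, list):
            entities = [entities]
        self.round_trips += 1
        self.stored.extend(entities)


entities = [{"name": "entity-%d" % i} for i in range(20)]

# Serial: one put() per entity, so 20 round trips.
serial = FakeDatastore()
for entity in entities:
    serial.put(entity)

# Batch: accumulate first, then a single put() on the whole list.
batch = FakeDatastore()
batch.put(entities)

saved_ms = (serial.round_trips - batch.round_trips) * ROUND_TRIP_MS
print("serial: %d round trips, batch: %d, saved: about %d ms"
      % (serial.round_trips, batch.round_trips, saved_ms))
```

The real saving depends on the actual latency and on how many entity
groups are touched, so treat the number as an estimate only.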

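The string-prefix approach to LIKE 'A%' mentioned above [7:45, 7:48]
relies on the order in which strings are sorted. The sketch below
mimics an index range scan with a sorted Python list; prefix_range and
prefix_scan are hypothetical helper names. Against the datastore, the
same bounds would be used as two inequality filters on one property
(prop >= prefix and prop < prefix + u"\ufffd"), which is our reading of
the tip in the linked documentation.

```python
import bisect

# Upper bound: the prefix followed by a very high code point. This
# assumes no stored string has a character at or above U+FFFD right
# after the prefix.
HIGH = u"\ufffd"


def prefix_range(prefix):
    """Return (low, high) such that low <= s < high for strings starting with prefix."""
    return prefix, prefix + HIGH


def prefix_scan(sorted_names, prefix):
    """Mimic an index range scan over an already sorted list of strings."""
    low, high = prefix_range(prefix)
    start = bisect.bisect_left(sorted_names, low)
    end = bisect.bisect_left(sorted_names, high)
    return sorted_names[start:end]


names = sorted([u"Aaron", u"Abby", u"Alice", u"Bob", u"alice", u"Zed"])
print(prefix_scan(names, u"A"))
print(prefix_scan(names, u"Al"))
```

Note that this only covers prefix matches and is case-sensitive; it
does not help with patterns such as '%A%'.
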
------------------------------------------------------------------------------------------
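
One of the bounding-box techniques discussed in the session [7:40 -
7:42] stores successively less accurate lat/lon pairs in a list
property and then uses an equality (membership) test for coarse
culling. The sketch below shows the idea in plain Python. The function
names and the cell string format are assumptions made for
illustration; this is not the API of the Geobox library mentioned in
the chat.

```python
from decimal import Decimal, ROUND_FLOOR


def cell(lat, lon, places):
    """Name the grid cell containing (lat, lon) at the given number of decimal places."""
    step = Decimal(10) ** -places
    lat_q = Decimal(str(lat)).quantize(step, rounding=ROUND_FLOOR)
    lon_q = Decimal(str(lon)).quantize(step, rounding=ROUND_FLOOR)
    return "%s|%s" % (lat_q, lon_q)


def cells(lat, lon, precisions=(3, 2, 1)):
    """Cells from most to least accurate; this list would be the list property."""
    return [cell(lat, lon, p) for p in precisions]


# Hypothetical sample points (name -> (lat, lon)).
points = {
    "cafe": (37.7749, -122.4194),
    "park": (37.7712, -122.4150),
    "museum": (40.7794, -73.9632),
}
index = dict((name, set(cells(lat, lon))) for name, (lat, lon) in points.items())

# Coarse culling: an equality test against the cell of the query location.
target = cell(37.7730, -122.4170, 2)
matches = sorted(name for name, stored in index.items() if target in stored)
print(target, matches)
```

A real query would usually need to test several neighbouring cells to
cover a box that straddles cell borders, and then filter the results by
exact coordinates, since the culling is deliberately imprecise.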

[7:00pm] Jason_Google: Hi Everyone. Welcome to the latest App Engine
Chat Time! I and a few other Googlers will be here for the next hour
to chat about all things App Engine. Fire away!
[7:00pm] knowtheory: hope their having a good time at the conf though
[7:00pm] dan_google: You can still ask Java questions, but Toby, Don
and Max won't be here tonight because they're at JavaOne.
[7:01pm] alexrudnick: dan_google: The Eclipse plugin team is
represented!
[7:01pm] dan_google: yay
[7:01pm] knowtheory: yeh, well i'm a rubyist primarily, and only
getting into the java stuff secondarily via JRuby
[7:01pm] maxoizo: Hi google team. I have a question not only to the
technical department, but to the financial department too. Please take
a look at this link: 
http://code.google.com/p/googleappengine/issues/detail?id=1650,
and say, is it possible in the near future or not?
[7:01pm] knowtheory: 404?
[7:02pm] knowtheory: oh comma fail
[7:02pm] Jason_Google: maxoizo: Probably not the near future, although
we do hope to expand the number of supported payment systems
eventually.
[7:02pm] dw: my only question would be about the status of xmpp
[7:03pm] Jason_Google: dw: It's coming along nicely. No ETA just yet,
however.
[7:03pm] dw: excellent. thanks
[7:03pm] Jason_Google: No problem.
[7:04pm] bthomson: sometimes i get a 500 error from infrastructure
that is not reported in the console under heavy load
[7:04pm] bthomson: is it caused by not enough interpreters spun up or
something?
[7:05pm] maxoizo: Jason_Google: very very bad. And one more question -
we want to transfer our big project to AppEngine, which possible
[7:06pm] Jason_Google: bthomson: How often do you get these 500s? Do
you see them only when your application hasn't gotten many requests
for a period or randomly?
[7:06pm] lidaobing: [java] the cron system is very unstable, for
example: issue 1333 and issue 1252, can we have a better cron system
soon?
[7:07pm] Jason_Google: maxoizo: Right now, Google Checkout is the only
supported payment platform, unfortunately. Please file an issue in the
issue tracker for your payment platform of choice:
http://code.google.com/p/googleappengine/issues/list
[7:08pm] bthomson: Jason_Google: i don't have any percentage for you,
but during load testing (ie, many requests are coming in) some small %
of requests come back with this 500 error which is different from a
python 500 error caused by the application
[7:08pm] Jason_Google: lidaobing: These are both known bugs, yes. We
don't have anyone here from the cron team but I know they're working
on getting these issues addressed.
[7:09pm] lidaobing: Jason_Google, thanks
[7:09pm] knowtheory: [java] are you guys ever gonna publish the
details/documentation for the dev server and/or app config stuff?
[7:09pm] bthomson: it appears to be just one request that gets the
error, not a cluster of requests in a particular timeframe
[7:09pm] knowtheory: (i should watch that, i haven't checked recently
if there have been doc updates)
[7:09pm] knowtheory: doesn't look like it though
[7:10pm] maxoizo: Jason_Google: we want to transfer our big project to
AppEngine, witch possible will require more than 500 requests per
second. Now project have cluster with 15 servers (tematics: ads - like
AdSence/AdWords). We have some questions not only to technical side,
but also TOS. Can i mail you in near time, cos i wantn't talk about
this project in icq
[7:10pm] Jason_Google: bthomson: Interesting. And you don't see any
log messages indicating a datastore timeout or other error?
[7:10pm] maxoizo: * in irc ^)
[7:10pm] alexrudnick: knowtheory: What sorts of details would you
like? Are there specific parts of the doc you'd like expanded?
[7:11pm] knowtheory: i've been manually decompiling and digging
through the dev server and appconfig code
[7:11pm] knowtheory: I've been building a lib that provides a
rubyesque interface to all that gear via JRuby
[7:11pm] bthomson: Jason_Google, no, it's definitely not reported in
the error console... the text screen is different from the text of a
normal 500 (syntax error or Timeout or whatever) with tracebacks
disabled as well
[7:12pm] nickjohnson: knowtheory: Any reason you can't use the Python
appcfg etc as reference, in that case?
[7:12pm] lidaobing: Jason_Google, bthomson I also experience this
problem when two request send at the same time
[7:12pm] lidaobing: one of them will get a 500 very quickly
[7:13pm] knowtheory: nickjohnson: i mean the actual details and
classes of how they're structured
[7:13pm] knowtheory: so the python docs probably serve as a good guide
for the user facing stuff
[7:13pm] nickjohnson: knowtheory: Are you trying to duplicate the
tools in JRuby, or just ue them?
[7:13pm] nickjohnson: er, use
[7:13pm] bthomson: it's not a huge problem for ajax apps because I can
just trap the 500 and rerequest, but it's kindof annoying
[7:13pm] bthomson: lidaobing, glad to see i didn't imagine it
[7:13pm] knowtheory: but i've actually got a setup which invokes and
starts the Jetty server up via JRuby
[7:14pm] knowtheory: rather than just shelling out to the scripts that
are provided w/ the appengine java sdk
[7:14pm] nickjohnson: bthomson: Have you tried checking with filter
level "requests only"? It's possible they show up as a 500 with no
stack trace, in which case they won't be logged as errors
[7:14pm] nickjohnson: knowtheory: So what exactly are you trying to
accomplish that requires more docs or source?
[7:14pm] dan_google: knowtheory: If you're asking for the source of
the SDK, we plan to release that soon.
[7:14pm] dan_google: (the Java SDK I mean)
[7:14pm] knowtheory: dan_google: yep  that's what i wanted to know
[7:14pm] knowtheory: cool
[7:14pm] bthomson: nickjohnson, thank you I will check, I did not know
that condition was possible
[7:15pm] knowtheory: nickjohnson: so the goal is being able to do
stuff with the server besides just run it.
[7:16pm] nickjohnson: knowtheory: Can you be more specific than
'stuff'?
[7:16pm] knowtheory: one really rudimentary example is being able to
stat the file system locally to keep track of changes to a ruby app,
and then restart the server to pick up on the changes
[7:16pm] nickjohnson: Ah
[7:16pm] dw: bthomson: up until a few months ago, i'd regularly see
500s around the time of a new deployment
[7:16pm] knowtheory: (so yes, i can be more specific  )
[7:16pm] nickjohnson: I'm not sure I understand why that requires
insight into the server, though
[7:16pm] dw: the 'no logs' kind which i think you're referring to
[7:16pm] nickjohnson: Do you want to extend the server itself to do
that?
[7:17pm] knowtheory: nickjohnson: well no i can wrap the behavior
around the server
[7:17pm] bthomson: dw: it's kinda scary because you don't know how
many users are getting errors and there's no record
[7:17pm] knowtheory: this was more a question for future changes to
the servers and other config gear
[7:17pm] knowtheory: so that i can keep track of the changes more
easily between sdk versions
[7:18pm] nickjohnson: bthomson: That sort of behaviour is (obviously)
a bug. Any information you can give us to help reproduce it is
appreciated.
[7:18pm] knowtheory: i can do the file system checking in ruby (and am
happy to)
[7:18pm] Jason_Google: bthomson: Did you check the request logs?
[7:18pm] nickjohnson: knowtheory: Fair enough. As dan_google said, we
do plan to release the source.
[7:18pm] knowtheory: cool
[7:19pm] maximity: Is this normal that posting (saving or updating) 20
root entities with no children takes about 2 seconds? Does this depend
on number of the entity's properties?
[7:19pm] bthomson: i will have to do some work to check the request
logs because the application spews a ton of logging data and the
problem only shows up under load
[7:20pm] bthomson: if I make a thread with "bug report" you guys will
see it right?
[7:20pm] nickjohnson: maximity: Are you putting them serially, or in
one batch?
[7:20pm] Jason_Google: bthomson: Yes, definitely.
[7:20pm] nickjohnson: The round-trip time is a substantial component
of how long it takes to perform a datastore operation
[7:20pm] maximity: ds.put(entity) in the for loop
[7:20pm] nickjohnson: bthomson: You can also file a bug.
[7:21pm] nickjohnson: maximity: Instead, accumulate entities to be
updated and do a db.put() on the whole list at the end of the loop
[7:21pm] nickjohnson: 1 round trip instead of 20.
[7:21pm] bthomson: haha, my bad
[7:21pm] maximity: thanks
[7:21pm] Jason_Google: bthomson: Like Nick said, any info. you can
provide in that report will help. If you only see it under heavy load
(provide the rough load estimate), whether you only see if after you
deploy. A screenshot of the error screen would help too even if it is
generic.
[7:21pm] dan_google: maximity: A batch put not only does only 1 round
trip, but can update the different entity groups in parallel.
[7:21pm] nickjohnson: Which will cut off about 50ms*19=0.95 seconds from
your execution time
[7:22pm] dan_google: maximity: If multiple entities in the same group
and put in a batch, it'll be one save (one change to the entity
group).
[7:22pm] bthomson: Jason_Google: thanks, sure np, I will post it as a
new bug if I see it happening during next load test
[7:22pm] dw: do unsaved entities with no parent set go to the same
entity group during a batch put? or are they considered individually
[7:23pm] dw: oh, stupid question i guess. the group depends on their
key
[7:23pm] nickjohnson: I imagine the source IP of the 500 and a
timestamp would help, too - submitted privately if you're concerned
about privacy
[7:23pm] Jason_Google: dw: I believe they're put as root entities.
[7:23pm] dan_google: dw: unsaved entities with no parent set are each
created in a new entity group.  However, done in a batch the creates
can occur in parallel.
[7:23pm] nickjohnson: dw: Any entity with no parent is in its own
entity group
[7:23pm] nickjohnson: Heh.
[7:25pm] knowtheory: nickjohnson: i just realized i did a terrible job
of answering your question
[7:25pm] knowtheory: So the goal is to be able to send signals to the
server from within a jruby script
[7:26pm] knowtheory: so that i can control stuff there, which requires
some documentation/reverse engineering of how the server actually
works
[7:26pm] knowtheory: and what the methods to start/restart/stop it are
and the like
[7:26pm] nickjohnson: 'signals' other than what Java apps can already
send to the server?
[7:26pm] nickjohnson: Oh, right, you mean from an _external_ jruby
script
[7:26pm] knowtheory: well i mean, i pull the relevant jars and classes
in to muck about with
[7:26pm] nickjohnson: I think the standard sysv operations are your
best bet there - sigterm, etc
[7:27pm] knowtheory: but the interface provided by default is just via
the shell script w/ the sdk
[7:27pm] nickjohnson: But I can certainly understand the goal of
extending the server to natively support monitoring the freshness of
your ruby code
[7:27pm] knowtheory: cool
[7:28pm] knowtheory: yeah and i don't know, if there's ever additional
stuff that gets included with the server, it'd be nice to be able to
interrogate some of that stuff from ruby potentially (but that's again
fairly vague at this point  )
[7:28pm] knowtheory: I'm still exploring the world of java libraries,
so i'm not an expert on what i can mix and match for interesting
purposes
[7:29pm] knowtheory: But the key really was the fact that the Jetty
server and the classes provided with the SDK aren't something that are
easily duplicated in ruby
[7:29pm] knowtheory: partially just because of the fact that AppEngine
behaves differently from the expectation of other ruby frameworks in a
variety of ways
[7:30pm] knowtheory: so trying to map the rubyist way of doing things
to the appengine way of doing things just requires reading on my part
and the like
[7:30pm] knowtheory: so anything you guys can do to make that reading
easier is much appreciated!
[7:33pm] knowtheory: okay guys, i'm being invited to go crash some
castles  thanks for the chat!
[7:33pm] maximity: How can one  estimate the storage space
requirements?
[7:33pm] maximity: I recently uploaded a data from the text file which
had original size about 1.5 MB (text ascii with delimiters) and the
amount of stored data increased by roughly 70 MB
[7:33pm] maximity: The file had about 7500 records converted stored in
a single root Entity/Model with 7500 instances
[7:34pm] bthomson: i think you can turn off single property indexes
now, that might help
[7:35pm] maximity: I have not configured any indexes yet
[7:35pm] nickjohnson: maximity: Did your app have other activity
during that period? Bear in mind that datastore usage is only updated
authoritatively once a day
[7:35pm] maximity: 99% not
[7:35pm] Jason_Google: knowtheory: Have fun.
[7:35pm] maximity: no requests
[7:35pm] nickjohnson: So what you saw beforehand could be a (low)
estimate, and what you saw after could be the updated figure for the
entire day
[7:35pm] maximity: I checked logs
[7:36pm] maximity: hmm, it is not likely if I have not seen any
requests in logs
[7:36pm] cwvh: I'm currently looking at porting a PostGIS-based map
app to GAE and currently hung up on what I'm going to do about
querying for points within a bounding box. I've glossed over some
clever tricks to get around the lack of GIS operators such as
geohashing and list properties, but do any of the GAE gurus have any
advice?
[7:36pm] maximity: is there any way to check a size allocated to
Entity?
[7:36pm] Jason_Google: maximity: Single-property indexes are
automatically created. You can disable these for individual properties
that you don't plan to query to save some space.
[7:36pm] nickjohnson: Not currently, no
[7:36pm] dw: maximity: how many fields did each 'line' have, and did
you have "indexed=False" in your model defs?
[7:37pm] maximity: I used defaults
[7:37pm] dw: (didn't i read somewhere we pay quota for single prop
indexes?)
[7:37pm] maximity: I would say about 40 fields
[7:37pm] bthomson: ^^^ and also does disabling single prop indexes
make puts faster?
[7:37pm] maximity: most of them Double
[7:38pm] dw: bthomson: it apparently does, i asked this a while back
[7:38pm] bthomson: wow, that could be very useful
[7:38pm] dw: test.. it might only be marginal
[7:39pm] Jason_Google: cwvh: I've seen some impressive geo-based
applications recently (I think one is being turned into a sample) but
it's using techniques like geohashing. Because you can't have
inequality filters on more than one property, that makes geo data
somewhat hard to work with in GAE, but geohashing does a decent job.
[7:39pm] nickjohnson: maximity: 40 fields is a reasonable number. The
length of your field names can also have an impact, actually.
[7:39pm] nickjohnson: cwvh: There are various GIS options available
for App Engine currently, but none are as mature as something like
PostGIS, currently
[7:40pm] Jason_Google: bthomson: Yes, writes will be faster since not
as many indexes need to be updated.
[7:40pm] nickjohnson: Geohashing/Hilbert curve based approaches suit
bigtable much better than tree based indexing, because they avoid
seeks
[7:40pm] nickjohnson: Or in the case of Bigtable, avoid multiple
queries/lookups
[7:40pm] cwvh: Jason_Google: I've been really intrigued by a technique
of using successively less accurate lat/lon pairs in a property list
and then using membership testing as means of quick (and not
particularly accurate) culling.. is membership testing via property
lists considered "dangerously" quick?
[7:40pm] bthomson: thanks Jason_Google
[7:41pm] cwvh: e.g., [(100.0135, x), (100.013, x), (100.01, x)]
[7:41pm] nickjohnson: cwvh: That's one extant approach - Brett's
Geobox library does this
[7:41pm] nickjohnson: The other approach is to store a single number
encoding both lat and long (hilbert curve / geohashing) and use a
range query to retrieve the contents of a bounding box
[7:42pm] Jason_Google: cwvh: It depends on how many items are in the
list. If you have any other list properties in your kind, particularly
ones that you need to use in your queries, indexes could start to be a
problem.
[7:42pm] maximity: as I said most of the fields became numbers stord
as Double, a few Stings, but nothing crazy about 200 charactes per
record i.e. split across these 40 properties
[7:42pm] bthomson: if maximity had 40 properties and there are 40
indexes, then increase of size 70x seems not unreasonable
[7:42pm] nickjohnson: The former uses only equality queries, but
requires more index entries; the latter only one index entry, but
requires you to use your one inequality filter
[7:43pm] nickjohnson: maximity: To store a single property in an
entity requires the length of the property (8 bytes in the case of a
double), plus the length of its name (for every entity), plus some
bytes of overhead
[7:43pm] dw: im a little surprised we're charged for field name
storage on a per entity basis?
[7:43pm] dw: would have thought a number was used internally, or
something
[7:43pm] Jason_Google: maximity: We are planning to add some more
documentation on how to estimate datastore storage requirements since
I think a lot of developers would appreciate this. This is on my
plate, actually.
[7:44pm] nickjohnson: dw: The datastore is schemaless; there's no
other way to allow arbitrary field names for arbitrary columns
[7:44pm] nickjohnson: The lower level API works more or less like a
dictionary
[7:44pm] nickjohnson: (For each entity, that is)
[7:45pm] maximity: thanks guys it will help us a lot
[7:45pm] maximity: Can you offer any advice for implementing a query
similar to SQL -> LIKE ‘A%’
[7:45pm] maxoizo: Some stupid question: your roadmap to be fulfill by
the end of June? Or is it a rough time?
[7:45pm] dw: nickjohnson: it makes sense, i guess. i assumed that
/somewhere/ where was an authoritative list of whats in use for an app
[7:45pm] nickjohnson: maximity: See the docs:
http://code.google.com/appengine/docs/python/datastore/queriesandindexes.html#Introducing_Indexes
[7:45pm] nickjohnson: The note in that section describes how to
implement a prefix query with inequalities
[7:46pm] Jason_Google: maxoizo: It's rough. There might be a few
things that slip a bit, but everything on that list is being actively
worked on now.
[7:46pm] nickjohnson: dw: The protocol buffers we use for data storage
are open-source - you can see exactly how we store it.
[7:46pm] maxoizo: Jason_Google: thanks
[7:47pm] dw: i assumed you'd be assigning field tags for each unique
name, kind of like an atom table (thats perhaps a windows-specific
term)
[7:47pm] maximity: I think I read this doc before, I believe App
Engine supports only basic operators unless I missed somehting
[7:47pm] maximity: The filter operator can be any of the following:
[7:47pm] maximity: < less than
[7:47pm] maximity: <= less than or equal to
[7:47pm] maximity: = equal to
[7:47pm] maximity: > greater than
[7:48pm] maximity: >= greater than or equal to
[7:48pm] maximity: != not equal to
[7:48pm] maximity: IN equal to any of the values in the provided list
[7:48pm] nickjohnson: Please don't paste large blocks of text in here
[7:48pm] maximity: sorry
[7:48pm] nickjohnson: There's a 'Tip' section at the end of the
section I linked you to, that describes how to implement a filter for
a string prefix using > and <
[7:49pm] nickjohnson: By making use of the order in which strings are
sorted
[7:55pm] dan_google: Anyone here playing with Wave?
[7:55pm] dw: give us xmpp and we might
[7:55pm] nickjohnson: dw: You can write Wave bots without needing XMPP
[7:55pm] dan_google:   An XMPP->Wave robot?
[7:55pm] dw: i'd love an account on the demo system, but i guess
they're as scarse as hen's teeth
[7:55pm] maximity: hmm, I have not received account info yet
[7:55pm] nickjohnson: dan_google: I think he's referring to the fact
that Wave uses XMPP
[7:56pm] dan_google: nickjohnson: Wave uses XMPP?
[7:56pm] dw: *scarce
[7:56pm] cwvh: have the sandbox accounts started to trickle out?
[7:56pm] dw: server<->server is based on XMPP, AFAIK
[7:56pm] dan_google: To I/O attendees for now, yes
[7:56pm] nickjohnson: dw: Right
[7:56pm] dan_google: oh that bit
[7:56pm] dan_google: right
[7:56pm] Jason_Google: dw: Google I/O attendees will be the first to
get accounts, but many of them are still waiting for credentials. I
think it will be a bit longer before the external developer community
at large gets an account, but we'll see.
[7:56pm] nickjohnson: Running out of official dev chat time - if you
have a question, ask now!
[7:56pm] maximity: hmm, I went to I/O and sent request but have not
received anything back
[7:56pm] dw: with some very strange looking non-uri identifiers, but
it's a preview, and i see sam ruby noticed this fact as well
[7:57pm] Jason_Google: maximity: In that case, you should hear back
soon.
[7:57pm] Jason_Google: They're still working on it, AFAIK.
[7:57pm] dan_google: dw: Wave<->App Engine already works, which is why
I mentioned it here.
[7:58pm] maxoizo: To java team: Will we expect in the near reliase to
support backgroundtask? Or only after? I know that python will support
this on next+ week
[7:58pm] dan_google: maxoizo: Do you mean the Task Queue API?
[7:58pm] dw: ah.. very quick question.. i saw mention of 'better than
cron' background tasks. does this mean an appengine mapreduce-alike?
[7:59pm] dan_google: dw: Not quite.  We're about to launch an
(experimental) task queue service.
[7:59pm] dw: aha. thanks.
[7:59pm] nickjohnson: dw: Look out for the video of the I/O talk on
said queues, out soon.
[7:59pm] bthomson: sounds exciting
[7:59pm] Jason_Google: By end of week. There was a presentation on it.
[7:59pm] Jason_Google: bthomson: Oh, it is.
[8:00pm] maxoizo: dan_google: yes
[8:00pm] Jason_Google: OK, we've reached the end of Chat Time. Thanks
for joining in! The next one will be in two weeks, June 17th, 9-10
a.m. PDT.
[8:01pm] cwvh: thanks for the Q&A session guys ~
[8:01pm] bthomson: thanks for chat!
[8:01pm] dw: thanks all
[8:01pm] Jason_Google: You're very welcome. Have a very good evening
(or morning depending).
[8:01pm] maxoizo: Thanks appengine team!