2009/5/29 Shedokan <[email protected]>:
>
> To make things faster I made two classes.
> This one holds the basic file or folder info:
> class Object(db.Model):
>     name = db.StringProperty(multiline=False)
>     path = db.StringProperty(multiline=False)
>     type = db.StringProperty(multiline=False)
>     info = db.StringProperty(multiline=False)
>
>     created = db.DateTimeProperty(auto_now_add=True)
>     changed = db.DateTimeProperty(auto_now_add=True)
>
>
> And this one holds the content of the file:
> class ObjectContent(db.Model):
>     name = db.StringProperty(multiline=False)
>     path = db.StringProperty(multiline=False)
>
>     contents = db.BlobProperty()
>
> That way, when I only need a file's info, I don't have to fetch all
> of its content.
> I am getting files like this:
> db.GqlQuery('SELECT * FROM Object WHERE name = :1 AND path = :2', name, path)
This query uses a composite index when it doesn't need one. Index
access adds around 100ms to each Datastore access (see [0] and [1]
below). Instead of using an index on (path, name), you can use a
key_name composed of the path and name. Something like:
    key_name = 'X' + path + name
    object_entity = Object(key_name=key_name, ...)
    content_entity = ObjectContent(key_name=key_name, contents=contents)
    db.put([ object_entity, content_entity ])
Then to fetch by key instead of querying:
    entity = ObjectContent.get_by_key_name(key_name)
Note the 'X' prefix on the key_name: it avoids an error if the path
starts with a digit. You should also ensure that no two distinct
combinations of (path, name) can ever produce the same key_name. If
they can in your application, separate the path and name with some
character that never appears in a path. This prevents ambiguous
key_names from being generated, e.g.:
    path, name = '/my/site/', 'foo'
    bad_key_name = '/my/site/foo'        <- ambiguous
    better_key_name = '/my/site/|foo'
    path, name = '/my/', 'site/foo'
    bad_key_name = '/my/site/foo'        <- ambiguous
    better_key_name = '/my/|site/foo'
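A minimal sketch of a helper that builds such a key_name. The function
name make_key_name and the choice of '|' as separator are my own
assumptions for illustration, not part of the App Engine API. It is
stricter than strictly necessary: it rejects the separator in the name
as well as in the path.

```python
SEPARATOR = '|'  # assumption: this character never appears in a path or name


def make_key_name(path, name, separator=SEPARATOR):
    """Build an unambiguous key_name from a (path, name) pair."""
    if separator in path or separator in name:
        raise ValueError('separator %r appears in path or name' % separator)
    # The leading 'X' keeps the key_name from starting with a digit.
    return 'X' + path + separator + name


# The two pairs from the example above no longer collide.
first = make_key_name('/my/site/', 'foo')   # 'X/my/site/|foo'
second = make_key_name('/my/', 'site/foo')  # 'X/my/|site/foo'
```

The same helper would be used both when writing the entities and when
looking them up, so the two sides can never disagree on the format.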
You can save yet more time by fetching the Object and the
ObjectContent in a single batch call:
    object_entity, content_entity = db.get([
        db.Key.from_path('Object', key_name),
        db.Key.from_path('ObjectContent', key_name)
    ])
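To see why the batch form helps, here is a rough sketch that only
counts round trips. FakeDatastore is a hypothetical in-memory stand-in
that I made up for this example, not the App Engine API, and it says
nothing about real latencies. The point is just that two separate gets
cost two round trips while one batched get costs one.

```python
class FakeDatastore(object):
    """Hypothetical in-memory stand-in that counts round trips."""

    def __init__(self, entities):
        self._entities = dict(entities)
        self.round_trips = 0

    def get(self, keys):
        """Fetch one key, or a list of keys, in a single round trip."""
        self.round_trips += 1
        if isinstance(keys, list):
            return [self._entities.get(k) for k in keys]
        return self._entities.get(keys)


key_name = 'X/my/site/|foo'
store = FakeDatastore({
    ('Object', key_name): {'name': 'foo', 'path': '/my/site/'},
    ('ObjectContent', key_name): {'contents': b'hello'},
})

# One at a time: each get is its own round trip.
obj = store.get(('Object', key_name))
content = store.get(('ObjectContent', key_name))
separate_trips = store.round_trips

# Batched: both entities come back from a single round trip.
store.round_trips = 0
obj2, content2 = store.get([('Object', key_name),
                            ('ObjectContent', key_name)])
batched_trips = store.round_trips
```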
[0]
http://code.google.com/status/appengine/detail/datastore/2009/05/23#ae-trust-detail-datastore-get-latency
[1]
http://code.google.com/status/appengine/detail/datastore/2009/05/23#ae-trust-detail-datastore-query-latency
>
> And I list all files in a folder like this:
> db.GqlQuery('SELECT * FROM Object WHERE path = :1', path)
>
> If only I could select just some fields rather than all of the info,
> like in SQL:
> SELECT name, path FROM ...
>
>
> On May 29, 17:19, David Wilson <[email protected]> wrote:
>> Hey Shedokan,
>>
>> Are you fetching your files from Datastore in a batch, or one at a time?
>>
>> data = []
>> for filename in ['a', 'b', 'c']:
>>     data.append(SomeModel.get_by_key_name(filename))
>>
>> Is significantly slower than:
>>
>> keys = [ db.Key.from_path('SomeModel', fn) for fn in [ 'a', 'b', 'c' ] ]
>> data = db.get(keys)
>>
>> 2009/5/29 Shedokan <[email protected]>:
>>
>> > Thanks, I am worried because I am trying to optimize my app to be
>> > almost as fast as the PHP version.
>>
>> > Usually an Ajax request takes 250ms (measured in Firebug) in the PHP
>> > version and 500ms in the Python version, so Python is twice as slow.
>> > But I guess that's because I have to store the files in the datastore
>> > and not in real directories.
>>
>> > Well, thanks anyway.
>>
>> > On May 29, 04:32, David Wilson <[email protected]> wrote:
>> >> Just assume that any string/list/hash/integer-related operations in
>> >> Python are likely faster than you'll ever need them to be. The
>> >> overhead for buffering the response is going to be tiny regardless of
>> >> your application, since at most you're only talking about handling
>> >> strings of up to 10 MB (which is the request size limit).
>>
>> >> If there is anything with AppEngine you need to be careful of, it is
>> >> use of Datastore, where reading/writing large numbers of entities will
>> >> cost a lot of performance. Removing a single db.get() from a request
>> >> saves as much time as thousands of calls to self.response.out.write().
>>
>> >> $ python /usr/lib/python2.5/timeit.py -v -s 'from cStringIO import StringIO; out = StringIO()' 'out.write("123")'
>> >> 10000 loops -> 0.00373 secs
>> >> 100000 loops -> 0.0383 secs
>> >> 1000000 loops -> 0.365 secs
>> >> raw times: 0.358 0.358 0.357
>> >> 1000000 loops, best of 3: 0.357 usec per loop
>>
>> >> $ ae
>> >> Python 2.5.1 (r251:54863, Feb 6 2009, 19:02:12)
>> >> [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
>> >> Type "help", "copyright", "credits" or "license" for more information.
>> >> (AppEngineShell)>>> import time
>> >> >>> t1 = time.time() ; db.get(db.Key.from_path('Foo', 1234)) ; print (time.time()-t1)*1000
>>
>> >> 12.0000839233
>>
>> >> David.
>>
>> >> 2009/5/29 Shedokan <[email protected]>:
>>
>> >> > Thanks, but does self.response.out affect speed very much?
>> >> > I couldn't benchmark it, strange...
>>
>> >> > On May 28, 22:25, David Wilson <[email protected]> wrote:
>> >> >> Using self.response.out will also delay sending your entire response
>> >> >> until it is sure to succeed.
>>
>> >> >> If you start generating output using 'print', and then e.g. a
>> >> >> Datastore request times out, or a bug in your code is triggered, you
>> >> >> have no chance to display a friendly error message. Instead the user
>> >> >> will get a half-rendered page with a stack trace embedded in it, or
>> >> >> worse.
>>
>> >> >> David.
>>
>> >> >> 2009/5/28 Shedokan <[email protected]>:
>>
>> >> >> > So I can't print binary data like images?
>>
>> >> >> > On May 28, 21:03, 风笑雪 <[email protected]> wrote:
>> >> >> >> Print is also OK, but you need to handle the headers yourself,
>> >> >> >> and it can only output text.
>> >> >> >> http://code.google.com/intl/en/appengine/docs/python/gettingstarted/h...
>>
>> >> >> >> print 'Content-Type: text/plain'
>> >> >> >> print ''
>> >> >> >> print 'Hello, world!'
>>
>> >> >> >> 2009/5/29 Shedokan <[email protected]>
>>
>> >> >> >> > I am wondering why I should use self.response.out.write instead
>> >> >> >> > of printing everything.
>>
>> >> >> >> > I am making an app where I have to produce output from a lot of
>> >> >> >> > different functions, and I am passing the object 'self'
>> >> >> >> > everywhere.
>>
>> >> >> >> > thanks.
>>
>> >> >> --
>> >> >> It is better to be wrong than to be vague.
>> >> >> — Freeman Dyson
--
It is better to be wrong than to be vague.
— Freeman Dyson