First thing to be aware of: in the App Engine datastore, you cannot have
inequality filters on more than one property.

So criteria as simple as:

 ColumnA > 50 AND ColumnB > 10

would not work.

To make something like that work, you must figure out whether you can use
pre-computed ranges.

Maybe you know that ColumnB will fall into ranges like: 10-1000, 100-1000,
200-1000, etc.

You can easily create a new property on the model that sets
ColumnB_Range = '500-1000' if ColumnB = 675 (or any other value in that range).
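In Python, that bucketing can be a tiny helper run at write time; a minimal sketch, assuming hypothetical fixed-width buckets of 500 (pick boundaries that match the ranges your queries actually use):

```python
def column_b_range(value):
    # Hypothetical fixed-width buckets of 500; adjust the width and
    # boundaries to whatever ranges your queries actually need.
    low = (value // 500) * 500
    return '%d-%d' % (low, low + 500)

# column_b_range(675) -> '500-1000'; store this string on the entity
# and filter on it with a plain equality filter.
```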

Then your criteria becomes:

ColumnA < 50 AND ColumnB_Range = '500-1000'

Sadly, the ranges for ColumnB probably won't be that simple.. and you'll
probably want ranges like 10-1000, 10-900, ..., 20-1000, 20-900, ..., etc..

That would give you 10,000 (100 * 100) ranges to precompute and store.. and
each entity will be in at least 100 of the ranges.. so that's 100+ properties
to plop onto each entity (whether they are individually defined or just
stuffed into a single ListProperty)..
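A sketch of what that precomputation might look like (the 10-unit step and the 10-1000 bounds are assumptions chosen to match the example above):

```python
# Enumerate every precomputed range 'low-high' that contains the value,
# so they can all be stored in a single ListProperty on the entity.
LOWS = range(10, 1001, 10)    # 10, 20, ..., 1000 (assumed step of 10)
HIGHS = range(10, 1001, 10)

def range_memberships(value):
    return ['%d-%d' % (lo, hi)
            for lo in LOWS
            for hi in HIGHS
            if lo <= value < hi]
```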

Now, if you know that the ColumnA property is queried in a simpler manner..
maybe you can focus on pre-computing it.  For example, if you only care
about ColumnA < 50, 100, 150, 200, or 250 (and you don't care about
ColumnA > anything).. you can just add 5 properties to the model, called
something like:  "LT50", "LT100", etc.

You set each property to 1 if ColumnA is less than its threshold.

Then, you could query like so:

LT50 = 1 AND ColumnB > 10 AND ColumnB < 1000 AND ColumnC = True
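Computing those flags at write time is trivial; a sketch with the five thresholds above (the model and property names are just placeholders):

```python
THRESHOLDS = (50, 100, 150, 200, 250)

def lt_flags(column_a):
    # Returns e.g. {'LT50': 0, 'LT100': 1, ...}; assign each value to
    # the matching model property before put().
    return dict(('LT%d' % t, int(column_a < t)) for t in THRESHOLDS)

# With the old db API the query would then look something like:
#   q = MyModel.all().filter('LT50 =', 1).filter('ColumnC =', True)
#   q.filter('ColumnB >', 10).filter('ColumnB <', 1000)
```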

You would just need an index defined on LT50, ColumnC and ColumnB (and
also indexes on LT100, ColumnC, ColumnB;  LT150, ColumnC, ColumnB;  etc).
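In index.yaml those composite indexes would look something like this (the kind name is a placeholder); the equality properties come first and the inequality property (ColumnB) goes last:

```yaml
indexes:
- kind: MyModel
  properties:
  - name: LT50
  - name: ColumnC
  - name: ColumnB
- kind: MyModel
  properties:
  - name: LT100
  - name: ColumnC
  - name: ColumnB
```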

Though, reading from a SQL database of 10 million rows using criteria like
those you describe should not be too intensive (with the right indexes)..

Were the data types for those rows just Ints and a Boolean?  What indexes
did you have defined on the table?

Depending on the shape of your data.. some compound indexes (or even a
clustered index) on ColumnC, ColumnA, ColumnB,  or on.. ColumnC, ColumnB,
ColumnA, using appropriate WHERE clauses in the created indexes, should be
as fast as one would ever hope.  (Granted, I know you've represented a
simplified form of your scenario.)

Again, if there was a partitioned range that got queried against (i.e. 10 to
1000, 1000 to 2000, 2000 to 3000, etc).. you could create special indexes
for each partition of that range.. and SQL would return your results very
fast.
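For example, SQLite supports such partial indexes; a sketch of the idea (table and index names are made up, and the partition bounds are just the ones from your example):

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)')
# Equality column first, then the range columns; the WHERE clause keeps
# only one partition of b in the index (c is 0/1, standing in for the
# boolean ColumnC).
con.execute('CREATE INDEX idx_c_a_b ON t (c, a, b) '
            'WHERE b > 10 AND b < 1000')
con.executemany('INSERT INTO t VALUES (?, ?, ?)',
                [(40, 500, 1), (60, 500, 1), (40, 5, 1)])
rows = con.execute('SELECT a, b FROM t WHERE c = 1 AND a < 50 '
                   'AND b > 10 AND b < 1000').fetchall()
# rows == [(40, 500)]
```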


On Sat, Nov 27, 2010 at 9:56 PM, smitts <[email protected]> wrote:

> I'm considering using GAE to host a relatively large (10 Million+
> rows, possibly larger) database.  Data from this database is then
> retrieved using filter criteria on multiple columns of the table and
> displayed to the user.  Typically the criteria is (column a < 50) &&
> (10 < column b < 1000) && (column c == true), which has been
> relatively database intensive on a traditional LAMP.
>
> Does anyone have experience running a similar setup?  Most of what I
> have seen GAE used for is much less database intensive.  Could it
> work?  Should I expect faster or slower responses from the LAMP?
>
> Thanks!
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>
