Does anyone know the recommended way for an ODB client to receive a large query result (>10000 records)?
OSQLAsynchQuery seemed like an alternative to pagination (which produces odd behavior:
<http://stackoverflow.com/questions/37731791/orientdb-automatic-pagination-returning-duplicate-result-short-pages-and-an-in>),
but OSQLAsynchQuery gives me this distressing warning when I ask for more than 10000 results:

INFO: {db=jsondb} [TIP] Query 'SELECT FROM Thing' returned a result set with more than 10000
records. Check if you really need all these records, or reduce the resultset by using a LIMIT
to improve both performance and used RAM

My example is on SO here:
<http://stackoverflow.com/questions/37842775/orientdb-large-query-tips-how-to-safely-implement-them-or-not-supported>.
Should I ignore the warning, or should I be doing something else? (A rough sketch of how I'm
driving OSQLAsynchQuery is at the bottom of this message.)

On Tuesday, June 14, 2016 at 8:37:50 AM UTC-7, Stuart Reynolds wrote:
>
> Bump!
>
> I have confirmed that pagination still has this odd (buggy?) behavior in
> 2.1.19 and filed a bug:
> https://github.com/orientechnologies/orientdb/issues/6298
>
> On Thursday, June 9, 2016 at 10:08:26 AM UTC-7, Stuart Reynolds wrote:
>>
>> (Sorry for the repost -- my original question was a mess. I have deleted it
>> and am reposting.)
>>
>> I'd like to iterate through a very large set of records in OrientDB.
>>
>> So that the result doesn't fill up my machine's memory, I've tried to
>> implement paginated queries, but I seem to be getting back:
>>
>> - duplicated documents
>> - record sets shorter than the page size
>> - an infinite series of results
>>
>> The original Java method listed in the docs
>> <http://orientdb.com/docs/last/Pagination.html> is as follows:
>>
>> OSQLSynchQuery<ODocument> query =
>>     new OSQLSynchQuery<ODocument>("select from Customer LIMIT 20");
>> for (List<ODocument> resultset = database.query(query);
>>      !resultset.isEmpty(); resultset = database.query(query)) {
>>   ...
>> }
>>
>> I've implemented this in Scala:
>>
>> val query = new OSQLSynchQuery[ODocument]("select from Thing LIMIT 5")
>> var resultset = db.query[OResultSet[ODocument]](query)
>> while (!resultset.isEmpty()) {
>>   // process result set here
>>   resultset = db.query(query)
>> }
>>
>> Here's the full example:
>>
>> def makeThing(x: Int) = {
>>   val doc = new ODocument("Thing")
>>   doc.field("x", x)
>>   doc
>> }
>>
>> val db: ODatabaseDocumentTx = new ODatabaseDocumentTx("memory:jsondb")
>> db.create()
>> db.set(MINIMUMCLUSTERS, 3)
>> db.set(CLUSTERSELECTION, "round-robin")
>> db.set(CONFLICTSTRATEGY, "content")
>> db.set(CHARSET, "UTF-8")
>>
>> println("SAVING--------")
>>
>> for (x <- 0 until 12) {
>>   val doc: ODocument = makeThing(x)
>>   val saved = db.save[ODocument](doc)
>>   println(saved)
>> }
>>
>> println("\n\nQUERYING--------")
>>
>> val query = new OSQLSynchQuery[ODocument]("select from Thing LIMIT 5")
>> var resultset = db.query[OResultSet[ODocument]](query)
>> while (!resultset.isEmpty()) {
>>   resultset.toArray.foreach(println)
>>   resultset = db.query(query)
>>   println("---------")
>> }
>>
>> But here's the output:
>>
>> SAVING--------
>> Thing#9:0{x:0} v1
>> Thing#10:0{x:1} v1
>> Thing#11:0{x:2} v1
>> Thing#9:1{x:3} v1
>> Thing#10:1{x:4} v1
>> Thing#11:1{x:5} v1
>> Thing#9:2{x:6} v1
>> Thing#10:2{x:7} v1
>> Thing#11:2{x:8} v1
>> Thing#9:3{x:9} v1
>> Thing#10:3{x:10} v1
>> Thing#11:3{x:11} v1
>>
>> QUERYING--------
>> Thing#9:0{x:0} v1
>> Thing#9:1{x:3} v1
>> Thing#9:2{x:6} v1
>> Thing#9:3{x:9} v1
>> Thing#10:0{x:1} v1    # So far, so good...
>> ---------
>> Thing#9:0{x:0} v1     # Already seen this
>> Thing#10:1{x:4} v1
>> Thing#10:2{x:7} v1
>> Thing#10:3{x:10} v1
>> Thing#11:0{x:2} v1
>> ---------
>> Thing#9:0{x:0} v1     # Already seen this
>> Thing#11:1{x:5} v1
>> Thing#11:2{x:8} v1
>> Thing#11:3{x:11} v1   # Page cut short
>> ---------
>> Thing#9:0{x:0} v1     # Already seen this!
>> ---------
>> Thing#9:1{x:3} v1
>> Thing#9:2{x:6} v1
>> Thing#9:3{x:9} v1
>> Thing#10:0{x:1} v1
>> Thing#10:1{x:4} v1
>>
>> Note that the DB is in memory, and no one is writing to it concurrently.
>>
>> Using ODB client 2.1.1.
>>
>> What's the sane and safe way to iterate through a very large dataset? As
>> far as I can see, the method in the docs is buggy.
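
For reference, here is roughly how I'm driving OSQLAsynchQuery at the moment. This is a
minimal, untested sketch: it reuses the db handle from the example above and assumes the
two-method OCommandResultListener interface of the 2.1.x client (newer clients may require
overriding additional methods such as getResult()):

import com.orientechnologies.orient.core.command.OCommandResultListener
import com.orientechnologies.orient.core.record.impl.ODocument
import com.orientechnologies.orient.core.sql.query.OSQLAsynchQuery

// Each record is pushed to the listener as it arrives, so nothing is
// accumulated client-side.
val listener = new OCommandResultListener {
  var count = 0
  override def result(iRecord: AnyRef): Boolean = {
    val doc = iRecord.asInstanceOf[ODocument]
    count += 1
    // ... process one document here ...
    true // returning false tells OrientDB to stop sending results
  }
  override def end(): Unit = println(s"done: $count records")
  // NOTE: on newer client versions this anonymous class may also need
  // to override getResult().
}

db.query[java.util.List[ODocument]](
  new OSQLAsynchQuery[ODocument]("select from Thing", listener))

The point being that result() is invoked once per record, so the client never has to hold
the whole result set in memory.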

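For completeness, the same Pagination page also describes paginating on @rid instead of
relying on the query object's internal LIMIT state. Below is a rough, untested Scala
transcription of that pattern against my Thing class; I haven't yet checked whether it
avoids the duplicates and short pages shown above:

import com.orientechnologies.orient.core.id.{ORID, ORecordId}
import com.orientechnologies.orient.core.record.impl.ODocument
import com.orientechnologies.orient.core.sql.query.OSQLSynchQuery

// Remember the last RID seen and ask for records after it. A freshly
// constructed ORecordId() compares lower than any stored RID, so the
// first page starts at the beginning of the class.
val ridQuery = new OSQLSynchQuery[ODocument]("select from Thing where @rid > ? LIMIT 5")
var last: ORID = new ORecordId()
var page = db.query[java.util.List[ODocument]](ridQuery, last)
while (!page.isEmpty) {
  for (i <- 0 until page.size) println(page.get(i))
  last = page.get(page.size - 1).getIdentity  // resume after the last record of this page
  page = db.query[java.util.List[ODocument]](ridQuery, last)
  println("---------")
}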