Bump! I have confirmed that pagination still shows this odd (buggy?) behavior in 2.1.19 and have filed a bug: https://github.com/orientechnologies/orientdb/issues/6298
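
Until that is resolved, one possible workaround is to page manually by record ID rather than re-running the same OSQLSynchQuery. The Pagination docs page linked in the quoted message appears to describe this as well, with a query of the form `select from Customer where @rid > #10:300 LIMIT 20`. The sketch below shows only the shape of that loop. It does not talk to OrientDB: `Rid`, `Thing`, `store`, `fetchPage` and `foreachPage` are hypothetical names, and `fetchPage` is an in-memory stand-in for the real query. Whether the real query sidesteps the behavior on 2.1.x has not been verified here.

```scala
object RidPagination {
  // Record id as (cluster, position), mirroring the #cluster:position form.
  final case class Rid(cluster: Int, position: Long)

  // Assumption: RIDs compare by cluster first, then by position.
  implicit val ridOrdering: Ordering[Rid] =
    Ordering.by((r: Rid) => (r.cluster, r.position))

  final case class Thing(rid: Rid, x: Int)

  // Hypothetical stand-in for the database: the 12 records from the quoted
  // example, spread round-robin over clusters 9, 10 and 11.
  val store: Vector[Thing] =
    (0 until 12)
      .map(x => Thing(Rid(9 + x % 3, (x / 3).toLong), x))
      .toVector
      .sortBy(_.rid)

  // Hypothetical stand-in for one query. Against a real database this would
  // issue something like "select from Thing where @rid > ? LIMIT ?".
  def fetchPage(after: Option[Rid], limit: Int): Vector[Thing] =
    store.filter(t => after.forall(a => ridOrdering.gt(t.rid, a))).take(limit)

  // Walks every page once, remembering only the last RID seen.
  def foreachPage(limit: Int)(process: Vector[Thing] => Unit): Unit = {
    var page = fetchPage(None, limit)
    while (page.nonEmpty) {
      process(page)
      page = fetchPage(Some(page.last.rid), limit)
    }
  }
}
```

Calling `RidPagination.foreachPage(5)(page => page.foreach(println))` walks the 12 simulated records in three pages and then stops. Because each request starts strictly after the last RID of the previous page, a record cannot appear twice and the loop ends on the first empty page.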

On Thursday, June 9, 2016 at 10:08:26 AM UTC-7, Stuart Reynolds wrote:
> (Sorry for the repost -- my original question was a mess. Have deleted it
> and am reposting).
>
> I'd like to iterate through a very large set of records in OrientDB.
>
> So that the result doesn't fill up my machine's memory, I've tried to
> implement paginated queries, but I seem to be getting back
>
>   - duplicated documents
>   - record sets shorter than the page size
>   - an infinite series of results
>
> The original Java method listed in the docs
> <http://orientdb.com/docs/last/Pagination.html> is as follows:
>
>     OSQLSynchQuery<ODocument> query = new OSQLSynchQuery<ODocument>("select from Customer LIMIT 20");
>     for (List<ODocument> resultset = database.query(query); !resultset.isEmpty(); resultset = database.query(query)) {
>       ...
>     }
>
> I've implemented this in Scala:
>
>     val query = new OSQLSynchQuery[ODocument]("select from Thing LIMIT 5")
>     var resultset = db.query[OResultSet[ODocument]](query)
>     while (!resultset.isEmpty()) {
>       // process result set here
>       resultset = db.query(query)
>     }
>
> Here's the full example:
>
>     def makeThing(x: Int) = {
>       val doc = new ODocument("Thing")
>       doc.field("x", x)
>       doc
>     }
>
>     val db: ODatabaseDocumentTx = new ODatabaseDocumentTx("memory:jsondb")
>     db.create()
>     db.set(MINIMUMCLUSTERS, 3)
>     db.set(CLUSTERSELECTION, "round-robin")
>     db.set(CONFLICTSTRATEGY, "content")
>     db.set(CHARSET, "UTF-8")
>
>     println("SAVING--------")
>
>     for (x <- 0 until 12) {
>       val doc: ODocument = makeThing(x)
>       val saved = db.save[ODocument](doc)
>       println(saved)
>     }
>
>     println("\n\nQUERYING--------")
>
>     val query = new OSQLSynchQuery[ODocument]("select from Thing LIMIT 5")
>     var resultset = db.query[OResultSet[ODocument]](query)
>     while (!resultset.isEmpty()) {
>       resultset.toArray.foreach(println)
>       resultset = db.query(query)
>       println("---------")
>     }
>
> But here's the output:
>
>     SAVING--------
>     Thing#9:0{x:0} v1
>     Thing#10:0{x:1} v1
>     Thing#11:0{x:2} v1
>     Thing#9:1{x:3} v1
>     Thing#10:1{x:4} v1
>     Thing#11:1{x:5} v1
>     Thing#9:2{x:6} v1
>     Thing#10:2{x:7} v1
>     Thing#11:2{x:8} v1
>     Thing#9:3{x:9} v1
>     Thing#10:3{x:10} v1
>     Thing#11:3{x:11} v1
>
>     QUERYING--------
>     Thing#9:0{x:0} v1
>     Thing#9:1{x:3} v1
>     Thing#9:2{x:6} v1
>     Thing#9:3{x:9} v1
>     Thing#10:0{x:1} v1    # So far, so good...
>     ---------
>     Thing#9:0{x:0} v1     # Already seen this
>     Thing#10:1{x:4} v1
>     Thing#10:2{x:7} v1
>     Thing#10:3{x:10} v1
>     Thing#11:0{x:2} v1
>     ---------
>     Thing#9:0{x:0} v1     # Already seen this
>     Thing#11:1{x:5} v1
>     Thing#11:2{x:8} v1
>     Thing#11:3{x:11} v1   # Page cut short
>     ---------
>     Thing#9:0{x:0} v1     # Already seen this!
>     ---------
>     Thing#9:1{x:3} v1
>     Thing#9:2{x:6} v1
>     Thing#9:3{x:9} v1
>     Thing#10:0{x:1} v1
>     Thing#10:1{x:4} v1
>
> Note that the DB is in memory, and no one is simultaneously writing to the
> DB.
>
> Using ODB client 2.1.1
>
> What's the sane and safe way to iterate through a very large dataset? As
> far as I can see, the method in the docs is buggy.
