Bump!

I have confirmed that pagination still has this odd (buggy?) behavior in 
2.1.19 and filed a bug.
https://github.com/orientechnologies/orientdb/issues/6298
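
A possible workaround until that is resolved, not verified against this bug, is to page on the record id instead of re-running the same query object: remember the last @rid seen and ask for records with a greater @rid, along the lines of "select from Thing where @rid > ? LIMIT 5". The sketch below is plain Java with no OrientDB dependency. It only simulates that idea over an in-memory list, so the Rid class, fetchPage and allPages are hypothetical names made up for illustration and are not OrientDB API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RidPaging {

    // Hypothetical stand-in for an OrientDB record id such as #9:0.
    static final class Rid implements Comparable<Rid> {
        final int cluster;
        final long position;

        Rid(int cluster, long position) {
            this.cluster = cluster;
            this.position = position;
        }

        // Order by cluster id first, then by position inside the cluster.
        @Override
        public int compareTo(Rid other) {
            int byCluster = Integer.compare(cluster, other.cluster);
            return byCluster != 0 ? byCluster : Long.compare(position, other.position);
        }

        @Override
        public String toString() {
            return "#" + cluster + ":" + position;
        }
    }

    // Simulates "where @rid > last LIMIT limit" over a list sorted by Rid.
    // A null 'last' means "start from the beginning".
    static List<Rid> fetchPage(List<Rid> sortedStore, Rid last, int limit) {
        List<Rid> page = new ArrayList<>();
        for (Rid rid : sortedStore) {
            if (page.size() == limit) {
                break;
            }
            if (last == null || rid.compareTo(last) > 0) {
                page.add(rid);
            }
        }
        return page;
    }

    // Walks the whole store page by page, carrying the last Rid forward.
    static List<List<Rid>> allPages(List<Rid> store, int limit) {
        List<Rid> sorted = new ArrayList<>(store);
        Collections.sort(sorted);
        List<List<Rid>> pages = new ArrayList<>();
        Rid last = null;
        List<Rid> page = fetchPage(sorted, last, limit);
        while (!page.isEmpty()) {
            pages.add(page);
            last = page.get(page.size() - 1);
            page = fetchPage(sorted, last, limit);
        }
        return pages;
    }

    public static void main(String[] args) {
        // Same layout as the example below: 12 records spread round-robin
        // over clusters 9, 10 and 11.
        List<Rid> store = new ArrayList<>();
        for (int x = 0; x < 12; x++) {
            store.add(new Rid(9 + x % 3, x / 3));
        }
        for (List<Rid> page : allPages(store, 5)) {
            System.out.println(page);
        }
    }
}
```

Because every page starts strictly after the last record of the previous one, no record can be returned twice and the loop ends once a page comes back empty. Whether the real query behaves this way across several clusters would still need to be checked against an actual database.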


On Thursday, June 9, 2016 at 10:08:26 AM UTC-7, Stuart Reynolds wrote:
>
> (Sorry for the repost -- my original question was a mess. I have deleted it 
> and am reposting.)
>
> I'd like to iterate through a very large set of records in OrientDB.
>
>
> So that the result doesn't fill up my machine's memory, I've tried to 
> implement paginated queries, but I seem to be getting back
>
>    - duplicated documents
>    - record sets shorter than the page size
>    - an infinite series of results
>
> The original Java method listed in the docs 
> <http://orientdb.com/docs/last/Pagination.html> is as follows:
>
>
> OSQLSynchQuery<ODocument> query = new OSQLSynchQuery<ODocument>("select from 
> Customer LIMIT 20");
> for (List<ODocument> resultset = database.query(query); !resultset.isEmpty(); 
> resultset = database.query(query)) {
>     ...
> }
>
>
> I've implemented this in Scala:
>
>
> val query = new OSQLSynchQuery[ODocument]("select from Thing LIMIT 5")
> var resultset = db.query[OResultSet[ODocument]](query)
> while (!resultset.isEmpty()) {
>   // process result set here
>   resultset = db.query(query)
> }
>
>
> Here's the full example:
>
>
> def makeThing(x: Int): ODocument = {
>   val doc = new ODocument("Thing")
>   doc.field("x",x)
>   doc
> }
>
> val db: ODatabaseDocumentTx = new ODatabaseDocumentTx("memory:jsondb")
> db.create()
> db.set(MINIMUMCLUSTERS, 3)
> db.set(CLUSTERSELECTION, "round-robin")
> db.set(CONFLICTSTRATEGY, "content")
> db.set(CHARSET, "UTF-8")
>
>
> println("SAVING--------")
>
> for (x <- 0 until 12) {
>   val doc:ODocument = makeThing(x)
>   val saved = db.save[ODocument](doc)
>   println(saved)
> }
>
>
> println("\n\nQUERYING--------")
>
> val query = new OSQLSynchQuery[ODocument]("select from Thing LIMIT 5")
> var resultset = db.query[OResultSet[ODocument]](query)
> while (!resultset.isEmpty()) {
>   resultset.toArray.foreach(println)
>   resultset = db.query(query)
>   println("---------")
> }
>
>
> But here's the output:
>
>
> SAVING--------
> Thing#9:0{x:0} v1
> Thing#10:0{x:1} v1
> Thing#11:0{x:2} v1
> Thing#9:1{x:3} v1
> Thing#10:1{x:4} v1
> Thing#11:1{x:5} v1
> Thing#9:2{x:6} v1
> Thing#10:2{x:7} v1
> Thing#11:2{x:8} v1
> Thing#9:3{x:9} v1
> Thing#10:3{x:10} v1
> Thing#11:3{x:11} v1
>
>
>
> QUERYING--------
> Thing#9:0{x:0} v1
> Thing#9:1{x:3} v1
> Thing#9:2{x:6} v1
> Thing#9:3{x:9} v1
> Thing#10:0{x:1} v1  # So far, so good...
> ---------
> Thing#9:0{x:0} v1   # Already seen this
> Thing#10:1{x:4} v1
> Thing#10:2{x:7} v1
> Thing#10:3{x:10} v1
> Thing#11:0{x:2} v1
> ---------
> Thing#9:0{x:0} v1    # Already seen this
> Thing#11:1{x:5} v1
> Thing#11:2{x:8} v1
> Thing#11:3{x:11} v1  # Page cut short
> ---------
> Thing#9:0{x:0} v1   # Already seen this!
> ---------
> Thing#9:1{x:3} v1
> Thing#9:2{x:6} v1
> Thing#9:3{x:9} v1
> Thing#10:0{x:1} v1
> Thing#10:1{x:4} v1
>
>
>
> Note that the DB is in memory, and no one is simultaneously writing to the 
> DB.
>
>
> I'm using OrientDB client 2.1.1.
>
>
> What's the sane and safe way to iterate through a very large dataset? As 
> far as I can see, the method in the docs is buggy.
>
>
>
