Definitely interesting possibilities, thank you!
Ultimately:
I think coin() doesn't decrease our traversal time; it just gets us a more 
random sample. If we're looking to get 10k results with a 50% coin, we'd 
end up traversing ~20k edges. Which of those edges we get would be random, 
but the runtime shouldn't be much better.
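
To put a number on that, here is a rough plain-Java simulation of the coin 
filter (not OrientDB or Gremlin code; CoinCost and edgesVisited are names I 
made up). It only counts how many edges have to be read before enough of 
them pass the coin toss:

```java
import java.util.Random;

final class CoinCost {
    /** Counts the edges read before `wanted` of them pass a coin toss with probability p. */
    static long edgesVisited(int wanted, double p, Random rng) {
        long visited = 0;
        int kept = 0;
        while (kept < wanted) {
            visited++;                    // every edge is still read
            if (rng.nextDouble() < p) {   // the coin only decides whether we keep it
                kept++;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        long visited = edgesVisited(10_000, 0.5, new Random());
        System.out.println("edges visited: " + visited);
    }
}
```

On average that comes out to wanted / p edges read, so around 20k for 10k 
results at p = 0.5.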

I'm not sure about shuffle. That may do it, but I believe that shuffling 
your results only shuffles the sample you've already taken; it doesn't 
shuffle the set of documents you go through to get that sample. By that I 
mean: given a set [0,1,2,3,4,5,6,7,8,9], selecting 4 records gets you 
[0,1,2,3]. I think with shuffle you'd get something like [2,3,0,1], not 
[6,2,8,3].
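
The difference I mean can be sketched with plain Java collections. This is 
not the Gremlin API, and I haven't checked which of the two behaviours 
order().by(shuffle) gives in our case; ShuffleOrder and its method names 
are only for illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ShuffleOrder {
    /** Take the first n items, then shuffle them: only the order changes. */
    static List<Integer> limitThenShuffle(List<Integer> docs, int n, Random rng) {
        List<Integer> sample = new ArrayList<>(docs.subList(0, n));
        Collections.shuffle(sample, rng);
        return sample;
    }

    /** Shuffle everything, then take the first n: a random sample, but every item is touched. */
    static List<Integer> shuffleThenLimit(List<Integer> docs, int n, Random rng) {
        List<Integer> all = new ArrayList<>(docs);
        Collections.shuffle(all, rng);
        return new ArrayList<>(all.subList(0, n));
    }
}
```

With the ten-element set above and n = 4, limitThenShuffle can only ever 
return some ordering of 0,1,2,3. shuffleThenLimit can return something like 
[6,2,8,3], but it pays for a pass over the whole set.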

I really like the last possibility: we get the edge count via size(), 
select either a set percentage or a set number of random ints from the 
range [0, size()), then use those ints to fetch what we need. I'm not 100% 
sure this is easy to implement through the Java API, but we'll see!
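
The index-picking half at least is easy in plain Java. A minimal sketch, 
assuming we already know the edge count (RandomIndexes and pick are 
hypothetical names, and how the indexes get mapped back to edges through 
the OrientDB API is the part I still need to work out):

```java
import java.util.Random;

final class RandomIndexes {
    /** Returns k distinct random indexes in [0, size), sorted for a single forward pass. */
    static int[] pick(int size, int k, Random rng) {
        if (k < 0 || k > size) {
            throw new IllegalArgumentException("need 0 <= k <= size");
        }
        if (k == 0) {
            return new int[0];
        }
        // Rejection-style sampling: fine when k is small relative to size.
        return rng.ints(0, size).distinct().limit(k).sorted().toArray();
    }
}
```

For a set percentage instead of a set number, k would just be something 
like (int) (size * fraction).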

Still, a SQL query with a randomized selection would be a great thing to 
have.


Ultimately, what we need is a weighted random selection: all our edges have 
weights, and I need to traverse them in a weighted random fashion. If we 
can implement this in a server-side function, we'd be in a good spot for 
query run time.
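
For reference, the selection step itself could look something like the 
sketch below. It is plain Java with the edge weights already loaded into an 
array, which is an assumption on my part; WeightedPick is a made-up name 
and nothing here is an OrientDB call. Each index is chosen with probability 
proportional to its weight:

```java
import java.util.Random;

final class WeightedPick {
    /** Picks index i with probability weights[i] / sum(weights). */
    static int pick(double[] weights, Random rng) {
        double total = 0;
        for (double w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("negative weight");
            }
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("need at least one positive weight");
        }
        double r = rng.nextDouble() * total;   // uniform in [0, total)
        for (int i = 0; i < weights.length; i++) {
            r -= weights[i];
            if (r < 0) {
                return i;                      // r landed inside edge i's slice
            }
        }
        // Only reachable through floating-point round-off: use the last weighted edge.
        for (int i = weights.length - 1; i >= 0; i--) {
            if (weights[i] > 0) {
                return i;
            }
        }
        throw new IllegalStateException("unreachable: total > 0 implies a positive weight");
    }
}
```

This is a linear scan per pick, so it costs one pass over the outgoing 
edges of the current vertex at each step of the traversal.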


On Monday, September 7, 2015 at 2:26:02 AM UTC-7, MV-dev1 wrote:
>
> OK - first, take this information with a grain of salt because I'm new to 
> OrientDB and haven't actually rolled out a successful release but.....
>
> *Thought #1:  Get 'Groovy' with it....*
>
> I've been reading all the ways you can write server functions (stored 
> procedures) and one is Groovy which seems to relate to or also be called 
> 'Gremlin' or 'TinkerPop' or 'Blueprints'.
>
> https://github.com/tinkerpop/blueprints/wiki/OrientDB-Implementation - 
> "Blueprints is the default Java API for OrientDB, so you don’t need to 
> include additional modules. For more information look at OrientDB 
> Blueprints API."
>
> I ran across this 'Coin Step' yesterday when scanning the TinkerPop3 
> documentation.
>
> From 
> http://tinkerpop.incubator.apache.org/docs/3.0.0-incubating/#coin-step
> Coin Step
>
> To randomly filter out a traverser, use the coin()-step (*filter*). The 
> provided double argument biases the "coin toss."
>
> e.g.  gremlin> g.V().coin(0.5)
>
> Order Step
>
> When the objects of the traversal stream need to be sorted, order()-step (
> *map*) can be leveraged.
>
> I noticed 'shuffle' -- "Randomizing the order of the traversers at a 
> particular point in the traversal is possible with Order.shuffle."
>
> e.g. gremlin> g.V().hasLabel('person').order().by(shuffle)
>
> Groovy statements do work as server functions so maybe this is something 
> that you could use?
>
> References.
>
>
>
> *Thought #2: 'JavaScript'*
>
> You can always write a stored proc that generates a random number from 0 
> to size() and steps ahead that number of steps.
>
>
> *For anyone that knows, please correct me.*
>
