[jira] [Commented] (TINKERPOP-2203) Bind the console timeout to the request timeout

2019-04-26 Thread ASF GitHub Bot (JIRA)


[ https://issues.apache.org/jira/browse/TINKERPOP-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16827111#comment-16827111 ]

ASF GitHub Bot commented on TINKERPOP-2203:
---

spmallette commented on pull request #1101: TINKERPOP-2203 Added console remote timeout to each request
URL: https://github.com/apache/tinkerpop/pull/1101
 
 
   https://issues.apache.org/jira/browse/TINKERPOP-2203
   
   Passes the console timeout to the server on each `:submit`:
   
   ```text
   gremlin> :remote config timeout 50
   ==>Set remote timeout to 50ms
   gremlin> :> Thread.sleep(1000);"back"
   Script evaluation exceeded the configured 'scriptEvaluationTimeout' threshold of 50 ms or evaluation was otherwise cancelled directly for request [Thread.sleep(1000);"back"]
   Type ':help' or ':h' for help.
   Display stack trace? [yN]y
   java.util.concurrent.TimeoutException: Script evaluation exceeded the configured 'scriptEvaluationTimeout' threshold of 50 ms or evaluation was otherwise cancelled directly for request [Thread.sleep(1000);"back"]
   at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$1(GremlinExecutor.java:315)
   at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
   at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:125)
   at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
   at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:465)
   at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
   at java.lang.Thread.run(Thread.java:748)
   ```
   
   All good with `mvn clean install && mvn verify -pl gremlin-console -DskipIntegrationTests=false`.
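   
   For reference, the per-request timeout leans on the `RequestOptions` API introduced in 3.4.0. A minimal driver-side sketch (not taken from the PR, so the exact console wiring may differ):
   
   ```java
   // Hypothetical standalone example: binding a 50ms timeout to a single
   // request via RequestOptions, which is what the console now does per :submit.
   import org.apache.tinkerpop.gremlin.driver.Client;
   import org.apache.tinkerpop.gremlin.driver.Cluster;
   import org.apache.tinkerpop.gremlin.driver.RequestOptions;
   
   public class TimeoutExample {
       public static void main(String[] args) {
           Cluster cluster = Cluster.build("localhost").create();
           Client client = cluster.connect();
           try {
               RequestOptions options = RequestOptions.build().timeout(50).create();
               // the server now cancels evaluation at the same 50ms the client waits
               client.submit("Thread.sleep(1000);'back'", options).all().join();
           } finally {
               cluster.close();
           }
       }
   }
   ```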
   
   VOTE +1
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Bind the console timeout to the request timeout
> ---
>
> Key: TINKERPOP-2203
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2203
> Project: TinkerPop
>  Issue Type: Improvement
>  Components: console
>Affects Versions: 3.3.6
>Reporter: stephen mallette
>Assignee: stephen mallette
>Priority: Major
> Fix For: 3.4.2
>
>
> {{:remote config timeout x}} sets a client-side timeout but doesn't override 
> the timeout on the server, so if the client timeout is shorter than the 
> server's, the server keeps processing even though the client has long since 
> stopped waiting. On the flip side, if the server timeout is shorter than the 
> client's, requests time out sooner than expected. 
> In 3.4.0 we introduced changes to the driver API that make this work better 
> by allowing the timeout to be easily set from the client side. We simply need 
> to use that to make the server aware of the timeout the client is using.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Using a bot to keep dependencies up to date

2019-04-26 Thread Stephen Mallette
Thanks again for investigating.

> I can try that out and if it's actually helpful add something to our
> development docs. Ideally we have simple commands to run for all GLVs to
> keep them updated in the future.

+1 for this approach for sure though :)

On Fri, Apr 26, 2019 at 11:49 AM Florian Hockmann wrote:

> I just reached out to ASF Infra and that basically concluded this
> discussion, as they simply don't allow any GitHub bots that require write
> access to the repo, which unfortunately includes dependabot.
>
> We could only enable it on a fork and then forward the PRs from there to
> the main repo, which we could also automate in principle, but at least
> for now this sounds like too much work.
>
> So, I guess we have to find other ways to keep the dependencies up to
> date, especially for the GLVs.
>
> At least for .NET, I know that there is a nice tool we could use that
> can list outdated dependencies and it should also be able to perform an
> update: https://github.com/jerriep/dotnet-outdated
>
> I can try that out and if it's actually helpful add something to our
> development docs. Ideally we have simple commands to run for all GLVs to
> keep them updated in the future.
>
> On 16.04.2019 at 13:39, Stephen Mallette wrote:
> >>  We can define the target branch for the PRs. So, we could set that to
> > tp33 and then merge through to master like we usually do it for other PRs.
> > (Maybe it's also possible to create different configurations for the
> > different branches if we want to get one PR per branch.)
> >
> > There are different dependencies on tp33 vs master so it would be nice to
> > know for sure how they handle that. Of course, you allude to a
> > proliferation of PRs that would result... ugh. Anyway, I guess I'd like to
> > know what the options really are there...
> >
> >> I assume that Apache Infra will have to activate GitHub bots like
> > dependabot for us as that requires ownership of the GitHub organization. I
> > hope that they can afterwards give us permissions to change the settings,
> > but I'm not sure about that. Maybe these permissions also only work on an
> > organization basis (which would include all ASF repos). Worst case would be
> > that ASF Infra has to configure such a bot for us.
> >
> > all good questions to answer. no one seems to really be objecting to this
> > direction (i only want to know more about the details of how it will work)
> > so I guess we're at a point where you could try to move things forward with
> > infra and the bot folks to see how these final issues we're discussing
> > would work out. please let us know what you find out. thanks.
> >
> > On Wed, Apr 10, 2019 at 9:52 AM Florian Hockmann wrote:
> >
> >>> always using the most recent has been disastrous in python. our build
> >> breaks all the time with no changes from us because of that style where we
> >> don't pin to specific dependencies. i don't understand that model at all. i
> >> know you're not saying that we blindly upgrade, i was just making a point
> >> about python that is semi-related.
> >>
> >> Understood and I agree completely that we should pin versions.
> >>
> >>> we're not always sure of what a change in dependency will bring in terms
> >> of change to the API but i agree that upgrades can take place (and as you
> >> pointed out at the start, are already taking place in a more manual
> >> fashion).
> >>
> >> IF our dependencies use semantic versioning, then this shouldn't be a
> >> problem as long as we don't do major version updates in our patch
> >> versions.
> >>
> >>> Why no PR for spark, hadoop, etc? not that they would pass compilation -
> >> i'd expect failures, but I'm just wondering if you knew the reason it
> >> didn't catch those?
> >>
> >> I also wondered about that as I had exactly the same expectation of
> >> immediately getting a failed build for these dependencies. Maybe they have
> >> a list of dependencies where they don't attempt an update? We can of course
> >> ask them about this if we decide to use this bot.
> >>
> >>> it looks like it submits all PRs against master. what about tp33? will
> >> we have to cherry-pick out of the PR to tp33, test and then merge forward
> >> to master?
> >>
> >> We can define the target branch for the PRs. So, we could set that to tp33
> >> and then merge through to master like we usually do it for other PRs. (Maybe
> >> it's also possible to create different configurations for the different
> >> branches if we want to get one PR per branch.)
> >>
> >>> i assume that you setup an account for dependabot that gets it
> >> configured at our repo? I assume that the account is bound to your github
> >> account?
> >>
> >> Yes, I enabled dependabot for my account and could then give it access to
> >> my own repos, which in this case means a fork of our TinkerPop repo.
> >>
> >>> can that dependabot dashboard for "tinkerpop" be accessed by others or
> >> just your account? how will 

Re: Using a bot to keep dependencies up to date

2019-04-26 Thread Florian Hockmann
I just reached out to ASF Infra and that basically concluded this
discussion, as they simply don't allow any GitHub bots that require write
access to the repo, which unfortunately includes dependabot.

We could only enable it on a fork and then forward the PRs from there to
the main repo, which we could also automate in principle, but at least
for now this sounds like too much work.

So, I guess we have to find other ways to keep the dependencies up to
date, especially for the GLVs.

At least for .NET, I know that there is a nice tool we could use that
can list outdated dependencies and it should also be able to perform an
update: https://github.com/jerriep/dotnet-outdated

I can try that out and if it's actually helpful add something to our
development docs. Ideally we have simple commands to run for all GLVs to
keep them updated in the future.
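
A sketch of how that could look from the command line, assuming
dotnet-outdated's documented usage (paths and flags unverified here):

```text
# list outdated NuGet dependencies for Gremlin.Net
dotnet outdated gremlin-dotnet

# apply the suggested upgrades
dotnet outdated gremlin-dotnet -u
```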

On 16.04.2019 at 13:39, Stephen Mallette wrote:
>>  We can define the target branch for the PRs. So, we could set that to
> tp33 and then merge through to master like we usually do it for other PRs.
> (Maybe it's also possible to create different configurations for the
> different branches if we want to get one PR per branch.)
>
> There are different dependencies on tp33 vs master so it would be nice to
> know for sure how they handle that. Of course, you allude to a
> proliferation of PRs that would result... ugh. Anyway, I guess I'd like to
> know what the options really are there...
>
>> I assume that Apache Infra will have to activate GitHub bots like
> dependabot for us as that requires ownership of the GitHub organization. I
> hope that they can afterwards give us permissions to change the settings,
> but I'm not sure about that. Maybe these permissions also only work on an
> organization basis (which would include all ASF repos). Worst case would be
> that ASF Infra has to configure such a bot for us.
>
> all good questions to answer. no one seems to really be objecting to this
> direction (i only want to know more about the details of how it will work)
> so I guess we're at a point where you could try to move things forward with
> infra and the bot folks to see how these final issues we're discussing
> would work out. please let us know what you find out. thanks.
>
> On Wed, Apr 10, 2019 at 9:52 AM Florian Hockmann wrote:
>
>>> always using the most recent has been disastrous in python. our build
>> breaks all the time with no changes from us because of that style where we
>> don't pin to specific dependencies. i don't understand that model at all. i
>> know you're not saying that we blindly upgrade, i was just making a point
>> about python that is semi-related.
>>
>> Understood and I agree completely that we should pin versions.
>>
>>> we're not always sure of what a change in dependency will bring in terms
>> of change to the API but i agree that upgrades can take place (and as you
>> pointed out at the start, are already taking place in a more manual
>> fashion).
>>
>> IF our dependencies use semantic versioning, then this shouldn't be a
>> problem as long as we don't do major version updates in our patch 
>> versions.
>>
>>> Why no PR for spark, hadoop, etc? not that they would pass compilation -
>> i'd expect failures, but I'm just wondering if you knew the reason it
>> didn't catch those?
>>
>> I also wondered about that as I had exactly the same expectation of
>> immediately getting a failed build for these dependencies. Maybe they have
>> a list of dependencies where they don't attempt an update? We can of course
>> ask them about this if we decide to use this bot.
>>
>>> it looks like it submits all PRs against master. what about tp33? will
>> we have to cherry-pick out of the PR to tp33, test and then merge forward
>> to master?
>>
>> We can define the target branch for the PRs. So, we could set that to tp33
>> and then merge through to master like we usually do it for other PRs. (Maybe
>> it's also possible to create different configurations for the different
>> branches if we want to get one PR per branch.)
>>
>>> i assume that you setup an account for dependabot that gets it
>> configured at our repo? I assume that the account is bound to your github
>> account?
>>
>> Yes, I enabled dependabot for my account and could then give it access to
>> my own repos, which in this case means a fork of our TinkerPop repo.
>>
>>> can that dependabot dashboard for "tinkerpop" be accessed by others or
>> just your account? how will that work?
>>
>> I assume that Apache Infra will have to activate GitHub bots like
>> dependabot for us as that requires ownership of the GitHub organization. I
>> hope that they can afterwards give us permissions to change the settings,
>> but I'm not sure about that. Maybe these permissions also only work on an
>> organization basis (which would include all ASF repos). Worst case would be
>> that ASF Infra has to configure such a bot for us.
>>
>>> i don't see much information about what the dashboard allows users 

[jira] [Created] (TINKERPOP-2203) Bind the console timeout to the request timeout

2019-04-26 Thread stephen mallette (JIRA)
stephen mallette created TINKERPOP-2203:
---

 Summary: Bind the console timeout to the request timeout
 Key: TINKERPOP-2203
 URL: https://issues.apache.org/jira/browse/TINKERPOP-2203
 Project: TinkerPop
  Issue Type: Improvement
  Components: console
Affects Versions: 3.3.6
Reporter: stephen mallette
Assignee: stephen mallette
 Fix For: 3.4.2


{{:remote config timeout x}} sets a client-side timeout but doesn't override 
the timeout on the server, so if the client timeout is shorter than the 
server's, the server keeps processing even though the client has long since 
stopped waiting. On the flip side, if the server timeout is shorter than the 
client's, requests time out sooner than expected. 

In 3.4.0 we introduced changes to the driver API that make this work better by 
allowing the timeout to be easily set from the client side. We simply need to 
use that to make the server aware of the timeout the client is using.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [TinkerPop] Re: A TP4 Structure Agnostic Bytecode Specification (The Universal Structure)

2019-04-26 Thread Joshua Shinavier
These past few days, I have had some requests for a more detailed write-up
of the data model, so here goes. See also my Global Graph Summit
presentation.

*Algebraic data types*

The basic idea of this data model, which I have nicknamed *algebraic
property graphs*, is that an "ordinary" property graph schema is just a
special case of a broader class of relational schemas with primary and
foreign keys. What are edges if not relations between vertices? What are
properties if not relations between elements (edges or vertices) and
primitive values? In this model, each edge label and property key
identifies a distinct relation in a graph. Vertex types identify unary
relations, i.e. sets.
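
To make that concrete, here is a minimal sketch (type and field names invented
for illustration) of a few of those relations for the classic graph, written as
plain Java records:

```java
// Hypothetical sketch: each vertex type, edge label, and property key as its
// own relation. Records stand in for relational tuples; longs are element ids.
record Person(long id) {}                     // vertex type = unary relation (a set)
record Project(long id) {}
record Knows(long id, long out, long in) {}   // edge type = relation between vertex ids
record Created(long id, long out, long in) {} // likewise, person -> project
record Name(long id, long out, String in) {}  // property type = element id to primitive
```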

For example, in the schema of the TinkerPop classic graph, below, Person
and Project are distinct vertex types, knows and created are distinct edge
types, etc. The primitive types are drawn in blue/purple, the vertex types
are salmon-colored, the edge types are yellow, and the property types are
green. The "o" and "i" ports on the boxes represent the "out" (tail) and
"in" (head) component of each edge type or property type.


[image: image.png]


Some details which should stand out visually:
1) In a typical property graph like this one, each type must be placed at
one of three levels: primitive or vertex, vertex property or edge, or edge
property. Vertex meta-properties would be at the third level as well, if
this graph had any. All projections (arrows between types) run from higher
levels to lower levels.
2) Primitive types and vertex types have no projections; all other types
have two. Element ids (i.e. primary keys) are not depicted.
3) Some ports have more than one outgoing arrow. This represents
*disjunction*, e.g. a weight property can be applied *either* to a knows
edge *or* a created edge.

Although disjoint unions may not be common in relational schemas (because
they introduce complexity), they are necessary for supporting general-purpose
algebraic data types, which are fundamental to a broad swath of data models
which we support at Uber, and which we would like to support in TinkerPop4.
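
For instance, the weight disjunction above could be modeled as a tagged union;
a minimal sketch with invented names, using Java's sealed types as the sum type:

```java
// Hypothetical sketch: the "out" end of a weight property is a disjoint union
// (sum type) of the two edge types that may carry it.
sealed interface WeightSubject permits KnowsRef, CreatedRef {}
record KnowsRef(long edgeId) implements WeightSubject {}
record CreatedRef(long edgeId) implements WeightSubject {}
record Weight(long id, WeightSubject out, double in) {}
```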

As we expand beyond vanilla property graphs, we quickly get into
greater-than-binary relations such as this hyper-edge type:

[image: image.png]


The type is drawn in a different color to indicate that it is neither a
vertex type (an element with no projections), nor a property type (an element
with projections to another element and a primitive value), nor an edge (an
element with projections to two other elements). It is simply an element.
The guiding principle here is similar to that of TinkerPop3's Graph.Features:
start with a maximally expressive data model, then refine the data model
for a particular context by adding constraints. Some examples of
schema-level constraints:

*) May a type have more than two projections? I.e. are hyper-edges / n-ary
relations supported?
*) Can edge types depend on other edge types? I.e. are meta-edges
(sometimes confusingly called hyperedges) supported?
*) Can property types depend on other property types? I.e. are
meta-properties supported?
*) Does every relation type of arity >= 2 need to have a primary key? I.e.
are compound data types supported (e.g. lat/lon pairs, records like
addresses with multiple fields)?
*) Are recursive / self-referential types (e.g. lists or trees of elements
or primitives) allowed?
etc.

There are also constraints which apply at the instance level, e.g.

*) May a graph contain two edges which differ only by id? I.e. are
non-simple edges supported?
*) May a graph contain two properties which differ only by id? I.e. are
multi-properties supported?
*) May a generalized property or edge instance reference itself?
etc.

With the right set of constraints, we obtain a basic property graph data
model. Relaxing the constraints, we can define and manipulate datasets
which are definitely not basic property graphs, but which are nonetheless
graph-like, and for which it makes sense to perform graph traversals. Enter
TinkerPop4.


[image: image.png]
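
To illustrate the idea (this is an invented sketch, not a TP4 API), such a
constraint profile could look much like TinkerPop3's Graph.Features:

```java
// Hypothetical constraint profile in the spirit of Graph.Features: start from
// the maximally expressive model and let a provider switch constraints on.
interface StructureFeatures {
    default boolean supportsHyperEdges() { return false; }      // > 2 projections / n-ary
    default boolean supportsMetaEdges() { return false; }       // edge types over edge types
    default boolean supportsMetaProperties() { return false; }  // property types over properties
    default boolean supportsCompoundTypes() { return false; }   // keyless records, e.g. lat/lon
    default boolean supportsRecursiveTypes() { return false; }  // lists/trees of elements
    default boolean supportsNonSimpleEdges() { return false; }  // instance: edges differing only by id
    default boolean supportsMultiProperties() { return false; } // instance: properties differing only by id
}
```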


*Graph traversal as relational algebra*

What are the fundamental operations 

Re: [DISCUSS] The Two Protocols of TP4

2019-04-26 Thread Jorge Bay Gondra
> are you saying that we should write an ANTLR parser that compiles
Gremlin-XXX into Bytecode directly?

Not exactly.

Currently users can send either bytecode or groovy scripts to be executed
on the server. I'm saying we replace "groovy scripts evaluation" with
"gremlin groovy traversal execution".

In TP3, it's possible for the user to submit to the script engine something
like "Thread.sleep(4000)" that will be executed inside a sandboxed vm.
I'm proposing we get rid of this approach in TP4 and, as gremlin groovy
scripts are still useful (for example, you can store a bunch of traversals
to execute in a text file), we replace it with a language recognition
engine that will parse what is sent and evaluate it, using a restricted
grammar set. The variant for gremlin strings would still be groovy/java, but
the user won't be able to submit arbitrary groovy instructions.
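
To sketch the whitelisting idea (all names invented; the real grammar would
come from the recognition engine):

```java
// Hypothetical sketch: tokens from the parser are checked against a whitelist
// of traversal steps before any evaluation, so arbitrary code never runs.
import java.util.List;
import java.util.Set;

final class TokenValidator {
    private static final Set<String> ALLOWED_STEPS =
            Set.of("V", "E", "out", "in", "values", "has", "limit");

    // tokens such as [["g"], ["V"], ["values", "name"]] for "g.V().values('name')"
    static void validate(List<List<String>> tokens) {
        for (List<String> call : tokens.subList(1, tokens.size())) {
            if (!ALLOWED_STEPS.contains(call.get(0))) {
                throw new IllegalArgumentException("Step not allowed: " + call.get(0));
            }
        }
    }
}
```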

I think this is not directly related to this thread (sorry!); do you think
I should start a new one to discuss this?

Jorge

On Tue, Apr 23, 2019 at 1:14 PM Marko Rodriguez wrote:

> Whoa! — are you saying that we should write an ANTLR parser that compiles
> Gremlin-XXX into Bytecode directly?
>
> Thus, for every Gremlin language variant, we will have an ANTLR parser.
>
> Marko.
>
> http://rredux.com 
>
>
>
>
> > On Apr 23, 2019, at 5:01 AM, Jorge Bay Gondra wrote:
> >
> > Hi,
> > Language recognition engines will give us a set of tokens, usually in
> some
> > sort of tree but the result can be thought of nested collections, for
> > example:
> >
> > The following string "g.V().values('name')" could be parsed into
> something
> > like [["g"], ["V"], ["values", "name"]].
> >
> > Then, we would have to create some sort of "evaluator", that translates
> > these string tokens into a traversal, similar to bytecode parsing and
> > execution. This evaluator can use static evaluation of the tokens (like,
> do
> > the tokens evaluate into something meaningful?), can be optimized with
> > caching techniques (like preparing traversals) and more importantly, will
> > only execute class methods that are whitelisted, i.e., users can't use it
> > to execute arbitrary groovy code.
> >
> > Best,
> > Jorge
> >
> >
> > On Tue, Apr 23, 2019 at 12:36 PM Marko Rodriguez wrote:
> >
> >> Hi Jorge,
> >>
> >>> Instead of supporting a ScriptEngine or enable providers to implement
> >> one,
> >>> TP4 could be a good opportunity to ditch script engines while continue
> >>> supporting gremlin-groovy string literals using language recognition
> >>> engines like ANTLR.
> >>
> >> Huh…….. Can you explain how you think of using ANTLR vs.
> >> ScriptEngine.submit(String)?
> >>
> >>> Language recognition and parsing engines have several benefits over the
> >>> current approach, most notably that it's safe to parse text using
> >> language
> >>> recognition as it results in string tokens, opposed to let users run
> code
> >>> in a sandboxed vm.
> >>
> >> How would the ANTLR-parsed text ultimately be executed?
> >>
> >> Thanks,
> >> Marko.
> >>
> >> http://rredux.com
>
>


Re: What makes 'graph traversals' and 'relational joins' the same?

2019-04-26 Thread Stephen Mallette
Trying to catch up on threads a bit... enjoying the discussion and I hope
I'm following along fully because it's sounding really nice. Letting the
type system be so open in previous versions of TinkerPop has created so
many inconsistencies and inelegant solutions, which have only been
exacerbated by Gremlin Language Variants. Anyway, regarding:

>> 5. ComplexTypes don’t go over the wire — a ComplexTypeProxy with
>> appropriately provided toString() is all that leaves the TP4 VM.
>>

> As a tuple, ComplexTypes / ADTs go over the wire. The values of their
> primitive fields should probably go with them. However, the values of their
> element / entity fields are just references; the attached element doesn't
> go with them.

I think I'd agree with Josh that we'd send these back over the wire,
especially if there is agreement that they are just a tuple form, which
means that providers won't need to get into low-level serializer
development for custom types. TinkerPop would just know how to deal with
them for network transport. I guess providers would just have to provide
libraries with the ComplexType/ADT implementations in the programming
languages they wanted to support. In cases where they didn't, a user could
be left to work with a raw TinkerPop ComplexType/ADT instance, which is
arguably a better state than where they are left now: serialization errors.



On Thu, Apr 25, 2019 at 2:07 PM Joshua Shinavier wrote:

> Hi Marko. Responses inline.
>
> On Wed, Apr 24, 2019 at 10:30 AM Marko Rodriguez wrote:
>
> > Hi,
> >
> > I think I understand you now. The concept of local and non-local data is
> > what made me go “ah!”
> >
>
> Nice. I also brought this up yesterday in the Property Graph Schema Working
> Group, where there is a discussion going on about whether/how graph
> databases can contain multiple graphs. Can an element belong to multiple
> graphs, can it have different properties in different graphs, etc. If each
> graph element is atomic, referencing other graph elements but not
> containing them, then it is very straightforward to think of a property
> graph as a simple set of elements. Graph relations are just set relations,
> making it easy to pull graphs apart and put graphs together (e.g. when
> building a stream, merging streams, etc.). If you are willing to make the
> open world assumption (e.g. "I know e[7] is a 'knows' edge, but I don't
> know what its out- and in-vertices are"), then you can't even partition a
> graph in such a way that the partitions are not valid graphs.
>
>
> So let me reiterate what I think you are saying.
> >
> > v[1] is guaranteed to have its id data local to it. All other information
> > could be derived via id-based "equi-joins.” Thus, we can’t assume that a
> > vertex will always have its properties and edges co-located with it.
>
>
> Yes indeed. A particular graph vendor may choose to co-locate properties
> with a vertex and edges with out- or in-vertex (or both, e.g. as JanusGraph
> does), but this is an optimization. At a logical level, you can think of an
> element and its dependents as belonging to separate relations.
>
>
>
> > However, we can assume that it knows where to get its property and edge
> > data when requested.
>
>
> Yes; you need to be able to select().
>
>
>
> > Assume the following RDBMS-style data structure that is referenced by
> > com.example.MyGraph.
> >
> > vertex_table
> > id label
> > 1  person
> > 2  person
> > …
>
>
> That is one way to go. I believe this scheme is what Ryan and David would
> call the Grothendieck construction; all relations of a given arity are
> marked with their type and concatenated into a single relation. I am still
> a little sketchy on the Grothendieck construction, so I hope that is a
> correct statement.
>
> However, you can also think of distinct element types (edge labels, vertex
> labels, property keys, hyperedge signatures) as distinct relations. So
> instead of vertex_table, you would have
>
> person_table
> id
> 1
> 2
>
> Vertices are such trivial relations that they don't need to be stored as a
> tables. Edges are more interesting:
>
> knows_table
> out in
> 1 2
> 1 4
>
> Properties are similar:
>
> name_table
> out out_label in
> 1 person marko
> 2 person vadas
> 3 project lop
> 4 person josh
> 5 project ripple
> 6 person peter
>
> The property table has a bit of a twist, because its out-label is a
> disjoint union of "person" and "project"; both persons and projects can
> have names, so you tag the out-element with its label/type. This is not
> necessary for "knows" because the out-label is always "person".
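
The equi-join view of traversal can be made concrete with a small sketch
(invented, using plain Java streams over the tables above): out("knows") is
just a join of person ids against knows_table.out.

```java
// Hypothetical sketch: out("knows") as an equi-join between the person
// relation and the knows relation, using plain Java streams.
import java.util.List;
import java.util.stream.Collectors;

public class EquiJoinSketch {
    record Knows(long out, long in) {}

    public static void main(String[] args) {
        List<Long> persons = List.of(1L, 2L, 4L, 6L);                   // person_table
        List<Knows> knows = List.of(new Knows(1, 2), new Knows(1, 4));  // knows_table

        // g.V().hasLabel('person').out('knows')  ==  join on person.id = knows.out
        List<Long> adjacent = persons.stream()
                .flatMap(id -> knows.stream().filter(k -> k.out() == id))
                .map(Knows::in)
                .collect(Collectors.toList());
        System.out.println(adjacent); // [2, 4]
    }
}
```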
>
>
>
> > properties_table
> > id  name   age
> > 1   marko  29
> > 2   josh   35
> > …
> >
> > edge_table
> > id outV  label  inV
> > 0  1knows   2
> > …
> >
>
> Yes, this also works, and is equivalent to what I wrote above, with one
> tweak: if tagged unions are supported (which IMO they should be, so we have
> both a "times" and a "plus" in our type algebra), 

Re: TP4 Processors now support both push- and pull-based semantics.

2019-04-26 Thread bryncooke



On 2019/04/24 12:19:54, Marko Rodriguez wrote:
> Hello,
> 
> > I think it would be better to either expose Flowable on the API (or Flow if 
> > you don't want to be tied in to RxJava)
> 
> We definitely don’t want to expose anything “provider specific.” Especially 
> at the Processor interface level. I note your Flow API reference in 
> java.concurrent and have noticed that RxJava mimics many java.concurrent 
> classes (Subscriber, Subscription, etc.). I will dig deeper.
> 
> > 1. Using Consumer will break the Rx chain. This is undesirable as it will 
> > prevent backpressure and cancellation from working properly.
> 
> Understood about breaking the chain.
> 
> > 2. The Scheduler to run the traversal on can be set. For instance, in the 
> > case where only certain threads are allowed to perform IO once the user has 
> > the Flowable they can call subscribeOn before subscribe.
> > 3. Backpressure strategy can be set, such as dropping results on buffer 
> > overflow.
> > 4. Buffer size can be set.
> 
> Hm. Here are my thoughts on the matter.
> 
> RxJava is just one of many Processors that will interact with TP4. If we 
> start exposing backpressure strategies, buffer sizes, etc. at the Processor 
> API level, then we expect other providers to have those concepts. Does Spark 
> support backpressure? Does Hadoop? Does Pipes? ...
> 
> I believe such provider-specific parameterization should happen via 
> language-agnostic configuration. For instance:
> 
> g = g.withProcessor(RxJavaProcessor.class, Map.of(“rxjava.backpressure”, 
> “drop”, “rxjava.bufferSize”, 2000))
> g.V().out().blah()
> 
> Unlike TP3, TP4 users will never interact with our Java API. They will never 
> have a reference to a Processor instance. They only talk to the TP4 VM via 
> Bytecode. However, with that said, systems that will integrate the TP4 VM 
> (e.g. database vendors, data server systems, etc.) will have to handle 
> Processor traverser results in some way (i.e. within Java). Thus, if they are 
> a Reactive architecture, then they will want to be able to Flow, but we need 
> to make sure that java.concurrent Flow semantics doesn't go too far in 
> demanding “unreasonable” behaviors from other Processor implementations. (I 
> need to study the java.concurrent Flow API)
> 
> Thus, I see it like this:
> 
>   1. RxJava specific configuration is not available at the Processor API 
> level (only via configuration).
>   2. Drop Consumer and expose java.concurrent Flow in Processor so the 
> chain isn’t broken for systems integrating the TP4 VM.
>   - predicated on java.concurrent Flow having reasonable 
> expectations of non-reactive sources (i.e. processors).
> 
> Does this make sense to you?
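
A minimal sketch (invented names; not the actual TP4 interface) of what
exposing java.util.concurrent.Flow at the Processor boundary might look like:

```java
// Hypothetical sketch: a processor result stream exposed as a
// java.util.concurrent.Flow.Publisher so reactive integrators keep
// backpressure instead of a chain-breaking Consumer callback.
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

final class ProcessorSketch {
    Flow.Publisher<String> results(Iterable<String> traversers) {
        SubmissionPublisher<String> publisher = new SubmissionPublisher<>();
        new Thread(() -> {
            // submit() honors subscriber demand, giving natural backpressure
            traversers.forEach(publisher::submit);
            publisher.close();
        }).start();
        return publisher;
    }
}
```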
> 
> ———
> 
> Stephen said you made a comment regarding ParallelRxJava as not being 
> necessary. If this is a true statement, can you explain your thoughts on 
> ParallelRxJava? My assumptions regarding serial vs. parallel:
> 
>   1. For TP4 VM vendors in a highly concurrent, multi-user environment, 
> multi-threading individual queries is bad.
>   2. For TP4 VM vendors in a lowly concurrent, limited-user environment, 
> multi-threading a single query is good.
>   - also related to the workload — e.g. ParallelRxJava for an AI 
> system where one query at a time is happening over lots of data.
> 
> Thank you for your feedback,
> Marko.
> 
> http://rredux.com 
> 
> 
> 
> 
> > On Apr 24, 2019, at 3:41 AM, brynco...@gmail.com wrote:
> > 
> > 
> > 
> > On 2019/04/23 13:07:09, Marko Rodriguez wrote:
> >> Hi,
> >> 
> >> Stephen and Bryn were looking over my RxJava implementation the other day 
> >> and Bryn, with his British accent, was like [I paraphrase]:
> >> 
> >>“Whoa dawg! Bro should like totally not be blocking to fill an 
> >> iterator. Gnar gnar for surezies.”
> >> 
> >> Prior to now, Processor implemented Iterator, where for RxJava, 
> >> when you do next()/hasNext() if there were no results in the queue and the 
> >> flowable was still running, then the iterator while()-blocks waiting for a 
> >> result or for the flowable to terminate.
> >> 
> >> This morning I decided to redo the Processor interface (and respective 
> >> implementations) and it is much nicer now. We have two “execute” methods:
> >> 
> >> Iterator Processor.iterator(Iterator starts)
> >> void Processor.subscribe(Iterator starts, Consumer consumer)
> >> 
> >> A processor can only be executed using one of the methods above. Thus, 
> >> depending on context and the underlying processor, the VM determines 
> >> whether to use pull-based or push-based semantics. Pretty neat, eh?
> >> 
> >>
> >> https://github.com/apache/tinkerpop/blob/tp4/java/machine/machine-core/src/main/java/org/apache/tinkerpop/machine/processor/Processor.java
> >>  
> >>