[jira] [Created] (TINKERPOP-2107) Spark fails to reattach properties

2018-11-30 Thread Daniel Kuppitz (JIRA)
Daniel Kuppitz created TINKERPOP-2107:
-

 Summary: Spark fails to reattach properties
 Key: TINKERPOP-2107
 URL: https://issues.apache.org/jira/browse/TINKERPOP-2107
 Project: TinkerPop
  Issue Type: Bug
  Components: process
Affects Versions: 3.3.4
Reporter: Daniel Kuppitz


The traversal {{g.V().outE().properties().dedup()}} throws a 
{{NullPointerException}} when run in Spark OLAP. I created a branch for this 
ticket and added a failing test.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TINKERPOP-2051) Uniqueness of property ids

2018-11-30 Thread stephen mallette (JIRA)


[ 
https://issues.apache.org/jira/browse/TINKERPOP-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16704594#comment-16704594
 ] 

stephen mallette commented on TINKERPOP-2051:
-

This 
[DISCUSS|https://lists.apache.org/thread.html/e9dcd998901ca24c14111d1a3d99d702da3ccf8ab5526e7640442d0a@%3Cdev.tinkerpop.apache.org%3E]
 thread explains why we've abandoned this effort for now. See this addition to the TP4 
doc for more information:

https://github.com/apache/tinkerpop/commit/855636d46aeb0c48930b6f8c65e5c3831425d5f9

> Uniqueness of property ids
> --
>
> Key: TINKERPOP-2051
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2051
> Project: TinkerPop
>  Issue Type: Bug
>  Components: structure
>Affects Versions: 3.2.9
>Reporter: Daniel Kuppitz
>Assignee: Daniel Kuppitz
>Priority: Major
>  Labels: breaking
>
> Right now we don't ensure property id uniqueness. As shown in [this 
> discussion|https://lists.apache.org/thread.html/e28d61e1f3b674b617765a0f28174d45691db0812e7c56761d1456c3@%3Cdev.tinkerpop.apache.org%3E],
>  this can lead to very odd results, hence I marked this ticket as a Bug.
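
As a rough illustration of the kind of surprise non-unique property ids can cause, the toy Java model below deduplicates properties by id. The classes ToyProperty and PropertyDedupSketch are hypothetical and are not TinkerPop's API. The sketch assumes only that deduplication compares properties by their id, and that ids are unique within a parent element but not across elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Toy model only: hypothetical class, not part of TinkerPop.
final class ToyProperty {
    final Object parentId; // id of the owning vertex or edge
    final Object localId;  // property id, unique only within its parent
    final String key;
    final Object value;

    ToyProperty(Object parentId, Object localId, String key, Object value) {
        this.parentId = parentId;
        this.localId = localId;
        this.key = key;
        this.value = value;
    }
}

final class PropertyDedupSketch {
    // Keeps the first property seen for each distinct identity.
    static List<ToyProperty> dedup(List<ToyProperty> input,
                                   Function<ToyProperty, Object> identity) {
        Set<Object> seen = new HashSet<>();
        List<ToyProperty> result = new ArrayList<>();
        for (ToyProperty p : input) {
            if (seen.add(identity.apply(p))) {
                result.add(p);
            }
        }
        return result;
    }

    // Identity from the property id alone.
    static Object byLocalId(ToyProperty p) {
        return p.localId;
    }

    // Identity that also remembers the parent element's id.
    static Object byParentAndLocalId(ToyProperty p) {
        return Arrays.asList(p.parentId, p.localId);
    }

    // Two parents whose first property happens to share the local id 0.
    static List<ToyProperty> sample() {
        return Arrays.asList(
                new ToyProperty(1, 0, "name", "marko"),
                new ToyProperty(2, 0, "name", "vadas"),
                new ToyProperty(2, 1, "age", 27));
    }

    public static void main(String[] args) {
        List<ToyProperty> props = sample();
        // Prints 2: "vadas" is dropped because its id collides with "marko".
        System.out.println(dedup(props, PropertyDedupSketch::byLocalId).size());
        // Prints 3: the parent id keeps the colliding properties apart.
        System.out.println(dedup(props, PropertyDedupSketch::byParentAndLocalId).size());
    }
}
```

In this model, the id-only comparison silently drops a distinct property, while the comparison that includes the parent id keeps all three.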





[jira] [Closed] (TINKERPOP-2051) Uniqueness of property ids

2018-11-30 Thread stephen mallette (JIRA)


 [ 
https://issues.apache.org/jira/browse/TINKERPOP-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stephen mallette closed TINKERPOP-2051.
---
Resolution: Later






Re: Uniqueness of property ids

2018-11-30 Thread Stephen Mallette
Things got even darker the further down the rabbit hole this one went. It
exposed inconsistencies between OLAP and OLTP around properties().dedup()
and further inconsistencies between edge properties and vertex properties
(who knows what happens with meta-properties). Ultimately, even with 3.4.0
allowing for breaking changes, I think this one needs longer than the few
weeks we have left before the release of 3.4.0 to really sort through. I
think we're abandoning efforts on this one, so no new Gryo 3.1, thankfully.
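
The serialization idea discussed further down this thread (leave the parent reference transient, but customize the (de)serialization methods so that only the parent element id travels with the property) can be sketched with plain java.io serialization. This is only an illustration under assumptions: ToyElement, ToyDetachedProperty and SerializationSketch are hypothetical names, java.io stands in for Gryo/Kryo, and the actual change in the PR may look quite different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical parent element; deliberately not Serializable.
final class ToyElement {
    final Integer id;
    final byte[] payload; // stands in for everything else an element carries

    ToyElement(Integer id, int payloadSize) {
        this.id = id;
        this.payload = new byte[payloadSize];
    }
}

// Hypothetical detached property; not a TinkerPop class.
final class ToyDetachedProperty implements Serializable {
    private static final long serialVersionUID = 1L;

    final String key;
    final String value;
    transient ToyElement parent; // never written, so the payload is not shipped
    transient Integer parentId;  // written by hand in writeObject below

    ToyDetachedProperty(String key, String value, ToyElement parent) {
        this.key = key;
        this.value = value;
        this.parent = parent;
        this.parentId = parent.id;
    }

    // Writes the non-transient fields, then only the parent's id.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeObject(parentId);
    }

    // Restores the fields and the parent id; the parent reference stays null.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        parentId = (Integer) in.readObject();
    }
}

final class SerializationSketch {
    // Serializes and deserializes the property in memory.
    static ToyDetachedProperty roundTrip(ToyDetachedProperty property) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(property);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ToyDetachedProperty) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ToyDetachedProperty original =
                new ToyDetachedProperty("name", "marko", new ToyElement(1, 1024));
        ToyDetachedProperty copy = roundTrip(original);
        System.out.println(copy.parent);   // the parent element itself is gone
        System.out.println(copy.parentId); // but its id is still available
    }
}
```

The trade-off in this sketch is that the parent element is not shuffled along with each property, yet the receiving side can still tell properties of different parents apart by the id.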

On Wed, Nov 28, 2018 at 9:49 AM Stephen Mallette 
wrote:

> kuppitz and i have been batting this one back and forth for a while now.
> it's gotten crazy and ultimately we were backed into a corner with no good
> solutions. the least of all evils involved creating Gryo 3.1 so that spark
> tests could pass with gryo given the revision to property ids being locally
> unique. So basically, Spark will always want to use Gryo 3.1 on 3.4.0+ or
> else you will get weirdness for traversals like:
>
> g.V().hasLabel("person").properties().dedup().value()
>
> For network serialization with Gryo you can stay on gryo 3.0/1.0 and not
> see any problem which means that 3.4.0 remains compatible with 3.3.x on
> those gryo versions. On 3.4.0 we will default to 3.1, but keep 3.0
> configurations in default Gremlin Server.
>
> Gryo seemed so straightforward a solution for us in those early TP3
> days...ends up being just a little faster than GraphSON and only
> useful on the JVM when we're flooded with GLVs. dah
>
> On Mon, Nov 19, 2018 at 9:33 AM Stephen Mallette 
> wrote:
>
>> Reading back through things again... I wrote on the issue a long
>> while back that this was going to be a change that just involved writing
>> tests to enforce this "uniqueness concept", but this change is happening at
>> a more foundational level it seems. Any reason to not just go that route
>> somehow? If we really wanted to change this at a foundational level we'd
>> need to stamp a new version of Gryo out (should never have used the default
>> serializer for "detached"), but I don't think we want a third version of
>> Gryo for 3.4.0 (if ever, considering that Jorge has a better solution
>> hopefully coming along).
>>
>> On Wed, Nov 14, 2018 at 10:18 AM Daniel Kuppitz  wrote:
>>
>>> Spark was the very first thing that failed - I think it was only the newly
>>> added test
>>> <https://github.com/apache/tinkerpop/pull/993/files#diff-7b625ebb74c59ffe5521b63a49797b64R169>,
>>> but of course, the failure made sense as this query can't work if the
>>> parent element's id is lost somewhere down the line.
>>> So, in one way or another, we have to remember the parent id if we want to
>>> allow local property id uniqueness.
>>>
>>> Cheers,
>>> Daniel
>>>
>>>
>>> On Tue, Nov 13, 2018 at 7:50 PM Stephen Mallette 
>>> wrote:
>>>
>>> > Gryo is so completely inflexible sometimes :|
>>> >
>>> > How did we end up having to do this again? Reading back through the thread,
>>> > it seemed like it was because of Spark testing. What kinds of tests were
>>> > failing for Spark because of this? I assume it wasn't just property
>>> > assertions in the tests themselves, right? Or was Spark just blowing up and
>>> > erroring out?
>>> >
>>> > On Tue, Nov 13, 2018 at 10:10 AM Daniel Kuppitz wrote:
>>> >
>>> > > That was my conclusion as well, I didn't see a way to make it non-breaking.
>>> > > Maybe 3.4.0 should just have a special serializer for VertexProperties..?
>>> > >
>>> > >
>>> > > On Tue, Nov 13, 2018 at 5:00 AM Stephen Mallette <spmalle...@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > If I'm thinking about this right... I guess Gryo used the default Kryo
>>> > > > serialization for 1.0 and 3.0, so by changing that we effectively break
>>> > > > both versions of Gryo in 3.4.0. And that would mean that you couldn't
>>> > > > connect older versions of the Java driver to 3.4.0 systems - you would
>>> > > > have to upgrade. Is that the correct conclusion?
>>> > > >
>>> > > > On Mon, Nov 12, 2018 at 9:28 AM Daniel Kuppitz wrote:
>>> > > >
>>> > > > > I actually opened a PR <https://github.com/apache/tinkerpop/pull/993>
>>> > > > > (makes it easier to look at the changes). Removing transient didn't quite
>>> > > > > do the job, so I overwrote the (de)serialization methods and only kept the
>>> > > > > parent element id. The only thing that's broken now is the serialization
>>> > > > > test suite.
>>> > > > >
>>> > > > > Cheers,
>>> > > > > Daniel
>>> > > > >
>>> > > > >
>>> > > > > On Mon, Nov 12, 2018 at 4:59 AM Stephen Mallette <spmalle...@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > I don't think we can remove the transient keyword. Doesn't that balloon
>>> > > > > > the amount of data shuffled around the network? I think that was the
>>> > > > > > reason we
>>> > > > >