[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...

2016-12-21 Thread metlos
Github user metlos commented on the issue:

https://github.com/apache/tinkerpop/pull/518
  
Created #519 with the updates as suggested here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] tinkerpop pull request #519: Make ElementHelper.areEqual(Property, Object) h...

2016-12-21 Thread metlos
GitHub user metlos opened a pull request:

https://github.com/apache/tinkerpop/pull/519

Make ElementHelper.areEqual(Property, Object) handle nulls…

… so that it can be used correctly in equals() methods of Property impls.

Added test methods for additional equality "scenarios" in ElementHelper.

Note that it is not possible to add tests for `equals()` on `Vertex`, `Edge`, 
`Property` and `VertexProperty` directly, as they are interfaces, not concrete 
classes. Instead I've added test methods for `ElementHelper` that cover 
various cases of properties and elements. Implementations of those interfaces 
are advised to use the `ElementHelper.areEqual()` methods in their `equals()`.
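
A minimal sketch of the delegation pattern described above, assuming a standalone stand-in for a provider's `Property` class. `SimpleProperty` and this local `areEqual` are hypothetical names for illustration only; they are not the TinkerPop sources and the real helper may differ in detail.

```java
import java.util.Objects;

public class NullSafePropertyEquality {

    // Hypothetical stand-in for a provider's Property implementation (not TinkerPop code).
    static final class SimpleProperty {
        private final String key;
        private final Object value;

        SimpleProperty(String key, Object value) {
            this.key = Objects.requireNonNull(key, "key");
            this.value = value;
        }

        @Override
        public boolean equals(Object other) {
            // Delegate to the shared helper so all implementations share one semantics.
            return areEqual(this, other);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, value);
        }
    }

    // Hypothetical helper in the spirit of ElementHelper.areEqual(Property, Object):
    // a null or foreign argument yields false instead of an exception.
    static boolean areEqual(SimpleProperty a, Object b) {
        if (a == b) {
            return true;
        }
        if (a == null || !(b instanceof SimpleProperty)) {
            return false; // instanceof is false for null, so b == null lands here too
        }
        SimpleProperty other = (SimpleProperty) b;
        return a.key.equals(other.key) && Objects.equals(a.value, other.value);
    }

    public static void main(String[] args) {
        SimpleProperty name = new SimpleProperty("name", "marko");
        System.out.println(name.equals(null));
        System.out.println(name.equals(new SimpleProperty("name", "marko")));
        System.out.println(name.equals("marko"));
    }
}
```

Keeping the null handling in one helper means each implementation's `equals()` stays a one-liner and cannot drift from the others.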


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/metlos/tinkerpop null-equality-on-props

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #519


commit 366a0b473e09ccdb4e49803d1ab804f82faf32e2
Author: Lukas Krejci 
Date:   2016-12-21T22:02:03Z

Make ElementHelper.areEqual(Property, Object) handle nulls so that it can
be used correctly in equals() methods of Property impls.

Added test methods for additional equality "scenarios" in ElementHelper.






[jira] [Commented] (TINKERPOP-1585) OLAP dedup over non elements

2016-12-21 Thread Daniel Kuppitz (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767603#comment-15767603
 ] 

Daniel Kuppitz commented on TINKERPOP-1585:
---

Quick performance test/comparison over TinkerGraph:

{noformat}
graph = TinkerGraph.open()
g = graph.traversal()
a = graph.traversal().withComputer()
r = new Random(123)

(1..100).each {
  def vid = ["a","b","c","d"].collectEntries {[it, r.nextInt() % 40]}
  graph.addVertex(id, vid)
}; []

clockWithResult(1) {g.V().id().select("c").count().next()}
clockWithResult(1) {g.V().id().select("c").dedup().count().next()}
clockWithResult(1) {a.V().id().select("c").count().next()}
clockWithResult(1) {a.V().id().select("c").dedup().count().next()}
{noformat}

{noformat}
gremlin> clockWithResult(1) {g.V().id().select("c").count().next()}
==>22.258808
==>100
gremlin> clockWithResult(1) {g.V().id().select("c").dedup().count().next()}
==>727.913942
==>570723
gremlin> clockWithResult(1) {a.V().id().select("c").count().next()}
==>23448.141182
==>100
gremlin> clockWithResult(1) {a.V().id().select("c").dedup().count().next()}
==>31519.832272
==>570723
{noformat}

Spark is a lot faster with the no-dedup traversal, but it probably takes 
advantage of multiple parallel threads.
(On my local machine) Spark (w/o TinkerPop) needs about 600 ms to count the 
number of Strings in a list of 1M items and approx. 5 seconds to count the 
number of distinct items. Maybe a Spark interceptor is the way to go.

> OLAP dedup over non elements
> 
>
> Key: TINKERPOP-1585
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1585
> Project: TinkerPop
>  Issue Type: Bug
>  Components: hadoop, process
>Affects Versions: 3.2.3
>Reporter: Daniel Kuppitz
>Assignee: Marko A. Rodriguez
>
> OLAP {{dedup()}} is highly inefficient when it's fed with non elements.
> In a customer project a query similar to the following returned a result in 
> slightly more than 6 seconds:
> {noformat}
> persistedRDD.
>   V().hasLabel("label1","label2").
>   inE("edgeLabel1","edgeLabel2").outV().
>   id().count()
> {noformat}
> The same query with {{dedup()}} added:
> {noformat}
> persistedRDD.
>   V().hasLabel("label1","label2").
>   inE("edgeLabel1","edgeLabel2").outV().
>   id().dedup().count()
> {noformat}
> ...took more than 120 seconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...

2016-12-21 Thread metlos
Github user metlos commented on the issue:

https://github.com/apache/tinkerpop/pull/518
  
Thanks for the feedback. I'm closing this PR then and will target the 
reworked patch to tp32.




[GitHub] tinkerpop pull request #518: Honor the equals() contract on Property impls b...

2016-12-21 Thread metlos
Github user metlos closed the pull request at:

https://github.com/apache/tinkerpop/pull/518




[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...

2016-12-21 Thread spmallette
Github user spmallette commented on the issue:

https://github.com/apache/tinkerpop/pull/518
  
Thanks @okram 

@metlos please also target the tp32 branch instead of master.




[jira] [Commented] (TINKERPOP-1585) OLAP dedup over non elements

2016-12-21 Thread Daniel Kuppitz (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767360#comment-15767360
 ] 

Daniel Kuppitz commented on TINKERPOP-1585:
---

To get a rough estimate of how long it should take, I added 1,725,403 Strings to 
a Set (459,966 unique values). These are numbers from the actual customer 
project.

{noformat}
gremlin> clockWithResult(1) { s = [] as Set; for (i = 0; i < 1725403; i++) { s << (i%459966).toString() }; s.size() }
==>1396.075091
==>459966
{noformat}

So it doesn't take much more than 1 second to deduplicate 1.7M Strings. I think 
we can also ignore network limitations, since we're not talking about lots of 
data.






[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...

2016-12-21 Thread okram
Github user okram commented on the issue:

https://github.com/apache/tinkerpop/pull/518
  
Yea, I would put the null check into `ElementHelper.areEqual()` so all 
providers have the same semantics. Then I would add a structure test to 
verify that null is okay for `equals()` for every element type: vertex, edge, 
property, vertex property.




[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...

2016-12-21 Thread spmallette
Github user spmallette commented on the issue:

https://github.com/apache/tinkerpop/pull/518
  
hmmm - i'm not sure why `ElementHelper.areEqual(Property, Object)` needs to 
throw exceptions for null situations. i'm not so sure that's really how it 
should work. In studying some of the usage, I don't see where that's really 
valuable logic. I think the better fix is to just make 
`ElementHelper.areEqual(Property, Object)` handle nulls properly like the other 
`areEqual()` methods.  @okram any problems taking that approach?




[GitHub] tinkerpop pull request #518: Honor the equals() contract on Property impls b...

2016-12-21 Thread metlos
GitHub user metlos opened a pull request:

https://github.com/apache/tinkerpop/pull/518

Honor the equals() contract on Property impls by allowing nulls as valid 
comparisons.

If user code put the properties into some kind of collection together with 
`null` values, things would blow up.

Note that `ElementHelper.areEqual(Element, Object)` allows for `null` 
values, but `ElementHelper.areEqual(Property, Object)` does not. I assume the 
reason for this discrepancy is that properties are never supposed to be null 
but merely not present.

While this assumption is valid within the TinkerPop implementation, I think it 
cannot be imposed on user code, which assumes a valid implementation of 
`equals()` that *should not* blow up on null arguments.
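
To illustrate the failure mode, here is a minimal sketch assuming a hypothetical `StrictProperty` whose `equals()` rejects `null` with an exception. The class name, the exception type and the message are assumptions made for the example, not the TinkerPop code. A plain `List.contains()` lookup is enough to trigger the problem, because the list calls `equals()` on the searched object once per element, including the `null` one.

```java
import java.util.Arrays;
import java.util.List;

public class NullInCollectionDemo {

    // Hypothetical property whose equals() throws on null (assumed behavior for illustration).
    static final class StrictProperty {
        private final String key;

        StrictProperty(String key) {
            this.key = key;
        }

        @Override
        public boolean equals(Object other) {
            if (other == null) {
                throw new IllegalArgumentException("argument can not be null");
            }
            return other instanceof StrictProperty
                    && key.equals(((StrictProperty) other).key);
        }

        @Override
        public int hashCode() {
            return key.hashCode();
        }
    }

    // Returns true if searching a list that holds a null element throws.
    static boolean lookupBlowsUp() {
        List<Object> values = Arrays.asList(null, "unrelated");
        try {
            // contains() invokes property.equals(element) for each element,
            // so the null element reaches equals() and triggers the exception.
            values.contains(new StrictProperty("name"));
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("lookup threw: " + lookupBlowsUp());
    }
}
```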


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/metlos/tinkerpop null-equality-on-props

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/tinkerpop/pull/518.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #518


commit a0b604c10f3ffbf96fbe444bc5059e2c9d0030a3
Author: Lukas Krejci 
Date:   2016-12-21T10:34:34Z

Honor the equals() contract on Property impls by allowing nulls as valid
comparisons.

If user code put the properties into some kind of collection together with
null values, things would blow up.






Re: Open graph effort at the Linux Foundation

2016-12-21 Thread 王斌

It's so great, thanks Jason
On Friday, December 9, 2016 at 3:24:33 AM UTC+8, Jason Plurad wrote:
>
> Many folks in the Titan community have continued to reach out wondering 
> how to continue development on an Apache-licensed, open source, and 
> scalable graph database with pluggable backends. I want to let you know 
> that the Linux Foundation is establishing an open community graph project, 
> including developers from various backend providers, to fulfill that need. 
> The logistics for this new home are being finalized, and it will carry on 
> the open source heritage of Titan with open governance. The Apache license 
> will be maintained, and the community will operate along the same 
> principles of an Apache project. Once naming the new project is complete, 
> all are welcome to join, contribute, and drive forward this scalable graph 
> solution.
>
> -- Jason
>