[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...
Github user metlos commented on the issue: https://github.com/apache/tinkerpop/pull/518 Created #519 with the updates as suggested here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] tinkerpop pull request #519: Make ElementHelper.areEqual(Property, Object) h...
GitHub user metlos opened a pull request: https://github.com/apache/tinkerpop/pull/519

Make ElementHelper.areEqual(Property, Object) handle nulls so that it can be used correctly in equals() methods of Property impls.

Added test methods for additional equality "scenarios" in ElementHelper. Note that it is not possible to add tests for equals on `Vertex`, `Edge`, `Property` and `VertexProperty` directly, as they are interfaces, not concrete classes. Instead I've added test methods for `ElementHelper` that cover various cases of properties and elements. The `ElementHelper.areEqual()` methods are the suggested basis for `equals` in implementations of those interfaces.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/metlos/tinkerpop null-equality-on-props

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/519.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #519

commit 366a0b473e09ccdb4e49803d1ab804f82faf32e2
Author: Lukas Krejci
Date: 2016-12-21T22:02:03Z

Make ElementHelper.areEqual(Property, Object) handle nulls so that it can be used correctly in equals() methods of Property impls. Added test methods for additional equality "scenarios" in ElementHelper.
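To illustrate the kind of change this pull request describes, here is a minimal Java sketch of a null-tolerant comparison helper that an `equals()` implementation can delegate to. It is not the TinkerPop code: `SimpleProperty` and this `areEqual` are hypothetical stand-ins, and returning `false` whenever either argument is null is an assumption made for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class NullSafePropertyEquality {

    /** Hypothetical stand-in for a Property implementation: a key plus a value. */
    static final class SimpleProperty {
        final String key;
        final Object value;

        SimpleProperty(String key, Object value) {
            this.key = Objects.requireNonNull(key, "key");
            this.value = value;
        }

        @Override
        public boolean equals(Object other) {
            // Delegate to the shared helper so every implementation gets the same semantics.
            return areEqual(this, other);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, value);
        }
    }

    /**
     * Hypothetical helper (not the TinkerPop method): returns false instead of
     * throwing when either argument is null.
     */
    static boolean areEqual(SimpleProperty a, Object b) {
        if (a == null || b == null) return false;   // assumption: null never equals anything
        if (a == b) return true;                    // same reference
        if (!(b instanceof SimpleProperty)) return false;
        SimpleProperty other = (SimpleProperty) b;
        return a.key.equals(other.key) && Objects.equals(a.value, other.value);
    }

    public static void main(String[] args) {
        SimpleProperty name = new SimpleProperty("name", "marko");
        List<SimpleProperty> props = Arrays.asList(null, new SimpleProperty("name", "marko"));
        System.out.println(name.equals(null));     // false, no exception
        System.out.println(props.contains(name));  // true, the null element is compared safely
    }
}
```

Keeping the null check in one shared helper, rather than in each `equals()` method, means every implementation that delegates to it behaves the same way.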
[jira] [Commented] (TINKERPOP-1585) OLAP dedup over non elements
[ https://issues.apache.org/jira/browse/TINKERPOP-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767603#comment-15767603 ]

Daniel Kuppitz commented on TINKERPOP-1585:
---

Quick performance test / comparison over TinkerGraph:

{noformat}
graph = TinkerGraph.open()
g = graph.traversal()
a = graph.traversal().withComputer()
r = new Random(123)
(1..100).each {
  def vid = ["a","b","c","d"].collectEntries {[it, r.nextInt() % 40]}
  graph.addVertex(id, vid)
}; []
clockWithResult(1) {g.V().id().select("c").count().next()}
clockWithResult(1) {g.V().id().select("c").dedup().count().next()}
clockWithResult(1) {a.V().id().select("c").count().next()}
clockWithResult(1) {a.V().id().select("c").dedup().count().next()}
{noformat}

{noformat}
gremlin> clockWithResult(1) {g.V().id().select("c").count().next()}
==>22.258808
==>100
gremlin> clockWithResult(1) {g.V().id().select("c").dedup().count().next()}
==>727.913942
==>570723
gremlin> clockWithResult(1) {a.V().id().select("c").count().next()}
==>23448.141182
==>100
gremlin> clockWithResult(1) {a.V().id().select("c").dedup().count().next()}
==>31519.832272
==>570723
{noformat}

Spark is a lot faster with the no-dedup traversal, but it probably takes advantage of multiple parallel threads. (On my local machine) Spark (w/o TinkerPop) needs about 600 ms to count the number of Strings in a list of 1M items and approx. 5 seconds to count the number of distinct items. Maybe a Spark interceptor is the way to go.

> OLAP dedup over non elements
>
> Key: TINKERPOP-1585
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1585
> Project: TinkerPop
> Issue Type: Bug
> Components: hadoop, process
> Affects Versions: 3.2.3
> Reporter: Daniel Kuppitz
> Assignee: Marko A. Rodriguez
>
> OLAP {{dedup()}} is highly inefficient when it's fed with non elements.
> In a customer project a query similar to the following returned a result in
> slightly more than 6 seconds:
> {noformat}
> persistedRDD.
> V().hasLabel("label1","label2").
> inE("edgeLabel1","edgeLabel2").outV().
> id().count()
> {noformat}
> The same query with {{dedup()}} added:
> {noformat}
> persistedRDD.
> V().hasLabel("label1","label2").
> inE("edgeLabel1","edgeLabel2").outV().
> id().dedup().count()
> {noformat}
> ...took more than 120 seconds.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...
Github user metlos commented on the issue: https://github.com/apache/tinkerpop/pull/518 Thanks for the feedback. I'm closing this PR then and will target the reworked patch to tp32.
[GitHub] tinkerpop pull request #518: Honor the equals() contract on Property impls b...
Github user metlos closed the pull request at: https://github.com/apache/tinkerpop/pull/518
[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/518 Thanks @okram. @metlos, please also target the tp32 branch instead of master.
[jira] [Commented] (TINKERPOP-1585) OLAP dedup over non elements
[ https://issues.apache.org/jira/browse/TINKERPOP-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15767360#comment-15767360 ]

Daniel Kuppitz commented on TINKERPOP-1585:
---

To get a rough estimate of how long it should take, I added 1,725,403 Strings to a Set (459,966 unique values). These are numbers from the actual customer project.

{noformat}
gremlin> clockWithResult(1) { s = [] as Set; for (i = 0; i < 1725403; i++) { s << (i%459966).toString()}; s.size() }
==>1396.075091
==>459966
{noformat}

So it doesn't take much more than 1 second to deduplicate 1.7M Strings. I think we can also ignore network limitations, since we're not talking about lots of data.
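For readers who want to repeat the estimate above outside the Gremlin console, the same measurement can be sketched in plain Java. This is only an approximation of the console test: the class and method names are hypothetical, and timings will differ by machine and JVM warm-up.

```java
import java.util.HashSet;
import java.util.Set;

final class SetDedupEstimate {

    /**
     * Adds `total` strings, cycling through `distinct` different values,
     * to a HashSet and returns the number of unique strings kept.
     */
    static int dedupCount(int total, int distinct) {
        Set<String> seen = new HashSet<>();
        for (int i = 0; i < total; i++) {
            seen.add(Integer.toString(i % distinct));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Sizes taken from the comment above: 1,725,403 strings, 459,966 unique values.
        int unique = dedupCount(1_725_403, 459_966);
        double millis = (System.nanoTime() - start) / 1_000_000.0;
        System.out.println(unique + " unique values in " + millis + " ms");
    }
}
```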
[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...
Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/518 Yea, I would put the null check into `ElementHelper.areEqual()` so all providers have the same semantics. Then I would add a structure test to verify that null is okay for `equals()` for every element type: vertex, edge, property, vertex property.
[GitHub] tinkerpop issue #518: Honor the equals() contract on Property impls by allow...
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/518 hmmm - i'm not sure why `ElementHelper.areEqual(Property, Object)` needs to throw exceptions for null situations. i'm not so sure that's really how it should work. In studying some of the usage, I don't see where that's really valuable logic. I think the better fix is just to make `ElementHelper.areEqual(Property, Object)` handle nulls properly like the other `areEqual()` methods. @okram any problems taking that approach?
[GitHub] tinkerpop pull request #518: Honor the equals() contract on Property impls b...
GitHub user metlos opened a pull request: https://github.com/apache/tinkerpop/pull/518

Honor the equals() contract on Property impls by allowing nulls as valid comparisons.

If user code put the properties into some kind of collection together with `null` values, things would blow up. Note that `ElementHelper.areEqual(Element, Object)` allows for `null` values, but `ElementHelper.areEqual(Property, Object)` does not. I assume the reason for this discrepancy is that properties are never supposed to be null but merely not present. While this assumption is valid in the TinkerPop implementation, I think it cannot be imposed on user code, which assumes a valid implementation of `equals()` that *should not* blow up on null arguments.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/metlos/tinkerpop null-equality-on-props

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/tinkerpop/pull/518.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #518

commit a0b604c10f3ffbf96fbe444bc5059e2c9d0030a3
Author: Lukas Krejci
Date: 2016-12-21T10:34:34Z

Honor the equals() contract on Property impls by allowing nulls as valid comparisons. If user code put the properties into some kind of collection together with null values, things would blow up.
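The failure mode described in this pull request can be reproduced with a small Java sketch. `StrictProperty` is a hypothetical class, not a TinkerPop type, and the use of `IllegalArgumentException` is an assumption for illustration. The point is that `List.contains(x)` calls `x.equals(element)` for each element, so an `equals()` that rejects null fails as soon as the list holds a null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class StrictEqualsDemo {

    /** Hypothetical property-like class whose equals() throws on null instead of returning false. */
    static final class StrictProperty {
        final String key;
        final Object value;

        StrictProperty(String key, Object value) {
            this.key = Objects.requireNonNull(key, "key");
            this.value = value;
        }

        @Override
        public boolean equals(Object other) {
            if (other == null) {
                // Violates the Object.equals contract, which requires x.equals(null) == false.
                throw new IllegalArgumentException("argument must not be null");
            }
            if (!(other instanceof StrictProperty)) return false;
            StrictProperty that = (StrictProperty) other;
            return key.equals(that.key) && Objects.equals(value, that.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(key, value);
        }
    }

    /** Returns true if searching `haystack` for `needle` throws because of the strict equals(). */
    static boolean lookupThrows(List<StrictProperty> haystack, StrictProperty needle) {
        try {
            haystack.contains(needle);  // invokes needle.equals(element) for each element in turn
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StrictProperty name = new StrictProperty("name", "marko");
        List<StrictProperty> props = Arrays.asList(null, name);
        System.out.println("lookup throws: " + lookupThrows(props, name));
    }
}
```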
Re: Open graph effort at the Linux Foundation
It's so great, thanks Jason

On Friday, December 9, 2016 at 3:24:33 AM UTC+8, Jason Plurad wrote:
>
> Many folks in the Titan community have continued to reach out wondering
> how to continue development on an Apache-licensed, open source, and
> scalable graph database with pluggable backends. I want to let you know
> that the Linux Foundation is establishing an open community graph project,
> including developers from various backend providers, to fulfill that need.
> The logistics for this new home are being finalized, and it will carry on
> the open source heritage of Titan with open governance. The Apache license
> will be maintained, and the community will operate along the same
> principles of an Apache project. Once naming the new project is complete,
> all are welcome to join, contribute, and drive forward this scalable graph
> solution.
>
> -- Jason
>