[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17726425#comment-17726425 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

vkagamlyk merged PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053

> DO NOT purge the output location if it has content in SparkGraphComputer
> ------------------------------------------------------------------------
>
> Key: TINKERPOP-2941
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2941
> Project: TinkerPop
> Issue Type: Improvement
> Affects Versions: 3.6.2
> Reporter: Redriver
> Priority: Major
>
> The default logic for SparkGraphComputer is to purge all content if the
> output location is specified. That is a dangerous operation especially for
> the output location which contains contents.
> https://github.com/apache/tinkerpop/blob/master/spark-gremlin/src/main/java/org/apache/tinkerpop/gremlin/spark/process/computer/SparkGraphComputer.java#L317:L324
> The correct behavior is to stop the process if it detects the output location
> is not empty.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17726424#comment-17726424 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

vkagamlyk commented on PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#issuecomment-1563641903

VOTE +1
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17725596#comment-17725596 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

kenhuuu commented on PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#issuecomment-1560258591

> I have no idea for which CHANGELOG entry I should add. My working branch is 3.5-dev and I don't know which release will include this patch.

The CHANGELOGs for each subsequent branch state that they include the items from the previous branches. Since you are merging into 3.5-dev, you can just add a short entry to the upcoming release for 3.5. Just place it here: https://github.com/apache/tinkerpop/blob/3.5-dev/CHANGELOG.asciidoc#L27
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724472#comment-17724472 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

ministat commented on PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#issuecomment-1555433075

> Could you please also add a CHANGELOG entry?

I have no idea for which CHANGELOG entry I should add. My working branch is 3.5-dev and I don't know which release will include this patch.
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724471#comment-17724471 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

ministat commented on code in PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#discussion_r1199532084

##
hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java:
##
@@ -73,6 +73,7 @@ private Constants() {
     public static final String GREMLIN_SPARK_PERSIST_STORAGE_LEVEL = "gremlin.spark.persistStorageLevel";
     public static final String GREMLIN_SPARK_SKIP_PARTITIONER = "gremlin.spark.skipPartitioner"; // don't partition the loadedGraphRDD
     public static final String GREMLIN_SPARK_SKIP_GRAPH_CACHE = "gremlin.spark.skipGraphCache"; // don't cache the loadedGraphRDD (ignores graphStorageLevel)
+    public static final String GREMLIN_SPARK_DONT_DELETE_NONE_EMPTY_OUTPUT = "gremlin.spark.dontDeleteNonEmptyOutput"; // don't delete the output if it is not empty

Review Comment:
Ack
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724399#comment-17724399 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

kenhuuu commented on PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#issuecomment-1555194730

Could you please also add a CHANGELOG entry?

VOTE +1, pending minor fixes.
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724398#comment-17724398 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

kenhuuu commented on code in PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#discussion_r1199338778

##
hadoop-gremlin/src/main/java/org/apache/tinkerpop/gremlin/hadoop/Constants.java:
##
@@ -73,6 +73,7 @@ private Constants() {
     public static final String GREMLIN_SPARK_PERSIST_STORAGE_LEVEL = "gremlin.spark.persistStorageLevel";
     public static final String GREMLIN_SPARK_SKIP_PARTITIONER = "gremlin.spark.skipPartitioner"; // don't partition the loadedGraphRDD
     public static final String GREMLIN_SPARK_SKIP_GRAPH_CACHE = "gremlin.spark.skipGraphCache"; // don't cache the loadedGraphRDD (ignores graphStorageLevel)
+    public static final String GREMLIN_SPARK_DONT_DELETE_NONE_EMPTY_OUTPUT = "gremlin.spark.dontDeleteNonEmptyOutput"; // don't delete the output if it is not empty

Review Comment:
Nit: I think there is a typo here. It should probably be GREMLIN_SPARK_DONT_DELETE_NON_EMPTY_OUTPUT
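
For readers following the review comment: the constant names a configuration key, and a flag of this kind is usually read as a boolean with a default. The sketch below is an illustration under assumptions, not the code from PR #2053. It uses the corrected constant name suggested in the review, a hypothetical `SparkOutputConfig` class, `java.util.Properties` as a stand-in for the configuration object TinkerPop actually uses, and an assumed default of `false` (the thread does not state the real default).

```java
import java.util.Properties;

// Hypothetical helper class; not part of TinkerPop.
final class SparkOutputConfig {

    // Key string as shown in the diff; constant name uses the NON_EMPTY spelling from the review.
    static final String GREMLIN_SPARK_DONT_DELETE_NON_EMPTY_OUTPUT =
            "gremlin.spark.dontDeleteNonEmptyOutput";

    private SparkOutputConfig() {
    }

    // Returns true only when the key is set to "true" (case-insensitive).
    // ASSUMPTION: an unset key means "false", i.e. the existing purge behavior is kept.
    static boolean dontDeleteNonEmptyOutput(Properties props) {
        return Boolean.parseBoolean(
                props.getProperty(GREMLIN_SPARK_DONT_DELETE_NON_EMPTY_OUTPUT, "false"));
    }
}
```

A caller would set `gremlin.spark.dontDeleteNonEmptyOutput=true` in its properties to opt in to the safer behavior.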
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719155#comment-17719155 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

codecov-commenter commented on PR #2053:
URL: https://github.com/apache/tinkerpop/pull/2053#issuecomment-1534112691

## [Codecov](https://app.codecov.io/gh/apache/tinkerpop/pull/2053?src=pr=h1_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) Report

> Merging [#2053](https://app.codecov.io/gh/apache/tinkerpop/pull/2053?src=pr=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (301095f) into [3.5-dev](https://app.codecov.io/gh/apache/tinkerpop/commit/3a9ca15e43b1b215f59a9f4c35a34e2a8caab6f2?el=desc_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation) (3a9ca15) will **decrease** coverage by `5.18%`.
> The diff coverage is `n/a`.

```diff
@@             Coverage Diff              @@
##            3.5-dev    #2053      +/-   ##
============================================
- Coverage     69.42%   64.24%   -5.18%
============================================
  Files           866       25     -841
  Lines         41251     3759   -37492
  Branches       5442        0    -5442
============================================
- Hits          28637     2415   -26222
+ Misses        10708     1166    -9542
+ Partials       1906      178    -1728
```

[see 841 files with indirect coverage changes](https://app.codecov.io/gh/apache/tinkerpop/pull/2053/indirect-changes?src=pr=tree-more_medium=referral_source=github_content=comment_campaign=pr+comments_term=The+Apache+Software+Foundation)
[jira] [Commented] (TINKERPOP-2941) DO NOT purge the output location if it has content in SparkGraphComputer
[ https://issues.apache.org/jira/browse/TINKERPOP-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17719151#comment-17719151 ]

ASF GitHub Bot commented on TINKERPOP-2941:
-------------------------------------------

ministat opened a new pull request, #2053:
URL: https://github.com/apache/tinkerpop/pull/2053

In a production environment, it is dangerous to delete a folder that is not empty. To avoid accidental data loss, it is better to ask the user to delete the folder manually if it is not empty.
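
The behavior requested in the issue (stop instead of purging when the output location already has content) can be sketched roughly as follows. This is a minimal illustration, not the implementation from PR #2053: `OutputLocationGuard` and `prepareOutputLocation` are hypothetical names, and `java.nio.file` on a local path stands in for the Hadoop file system that SparkGraphComputer works against.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper class; not part of TinkerPop.
final class OutputLocationGuard {

    private OutputLocationGuard() {
    }

    // True if the path is a directory that contains at least one entry.
    static boolean isNonEmptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return entries.iterator().hasNext();
        }
    }

    // Prepares the output location before a job writes to it.
    // With the flag set, a non-empty directory makes the call fail and nothing is deleted.
    // Otherwise the location is removed recursively, mirroring the old purge behavior.
    static void prepareOutputLocation(Path output, boolean dontDeleteNonEmptyOutput)
            throws IOException {
        if (!Files.exists(output)) {
            return; // nothing to clean up
        }
        if (dontDeleteNonEmptyOutput && isNonEmptyDirectory(output)) {
            throw new IllegalStateException(
                    "Output location is not empty, remove it manually: " + output);
        }
        try (Stream<Path> walk = Files.walk(output)) {
            // Reverse order visits children before their parent directory.
            walk.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

Failing fast leaves the decision to delete existing data with the user, which is the point made in the pull request description.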