[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/17858 @gatorsmile It seems this was my mistake; I will try to fix it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/17858 @gatorsmile My production environment runs Spark 2.0.2, and the test succeeds there. Has something changed since 2.0.2 for this case? Thanks!
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17858 **[Test build #76566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76566/testReport)** for PR 17858 at commit [`de938ed`](https://github.com/apache/spark/commit/de938ed91f6e62b02fed32c7123f3aae5d51d9f7).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17858 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76566/
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17858 Merged build finished. Test FAILed.
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17858 **[Test build #76566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76566/testReport)** for PR 17858 at commit [`de938ed`](https://github.com/apache/spark/commit/de938ed91f6e62b02fed32c7123f3aae5d51d9f7).
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17858 ok to test
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17858 This will not pass the test cases, because we only deleted the child directory `.hive-staging` of the `stagingDir`.
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/17858 Yes, I tried the same thing in Hive and got the same error:
`2017-05-08T13:48:04,634 ERROR exec.Task (:()) - Failed with exception Unable to move source hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 to destination hdfs://nameservice/hive/test_table1
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 to destination hdfs://nameservice/hive/test_table1
  at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2959)
  at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
  at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1805)
  at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:355)
  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1917)
  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1586)
  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1331)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1092)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1080)
  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
  at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
  at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
  at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.FileNotFoundException: File hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 does not exist.
  at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:697)
  at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
  at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
  at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
  at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
  at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1525)
  at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2896)
  ... 22 more
2017-05-08T13:48:04,635 ERROR ql.Driver (:()) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 to destination hdfs://nameservice/hive/test_table1`
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/17858 If this is a Hive bug, this patch seems like a valid workaround for Spark SQL.
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17858 This sounds like a bug in the Hive metastore. Could you try the same thing in Hive? Do you hit the same error? Let us see how Hive behaves, and then we can decide on the best way to handle it. Thanks! BTW, you need to add a test case, for example in `InsertIntoHiveTableSuite.scala`.
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/17858 Do you really need to force this? Or is it just that any path relative to the output dir has to be a hidden directory starting with "." or "_"? For example, right now this prevents me from making the staging dir "/foo/bar", but I don't see a reason to disallow that.
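The narrower rule suggested in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch under review: `StagingDirRuleSketch` and `isAcceptable` are hypothetical names, local `java.nio.file` paths stand in for HDFS paths, and the sketch assumes that only the first path element below the table directory needs a hidden name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: local paths stand in for HDFS paths; not Spark's or Hive's code.
final class StagingDirRuleSketch {

    // A name counts as hidden when it starts with "." or "_".
    static boolean isHiddenName(String name) {
        return name.startsWith(".") || name.startsWith("_");
    }

    // Accept any staging dir outside the table dir; inside it, require that the
    // first path element below the table dir has a hidden name.
    static boolean isAcceptable(String tableDir, String stagingDir) {
        Path table = Paths.get(tableDir).normalize();
        // resolve() returns stagingDir itself when it is absolute
        Path staging = table.resolve(stagingDir).normalize();
        if (!staging.startsWith(table)) {
            return true;  // e.g. "/foo/bar": outside the table dir, so nothing to enforce
        }
        if (staging.equals(table)) {
            return false; // the table dir itself cannot serve as its own staging dir
        }
        Path firstChild = table.relativize(staging).getName(0);
        return isHiddenName(firstChild.toString());
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("/hive/test_table1", "/foo/bar"));
        System.out.println(isAcceptable("/hive/test_table1", ".hive-staging"));
        System.out.println(isAcceptable("/hive/test_table1", "test_hive"));
    }
}
```

Under these assumptions, an absolute staging dir such as "/foo/bar" is left alone, while a relative one is checked only for its first element under the table directory.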
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user zuotingbing commented on the issue: https://github.com/apache/spark/pull/17858 In this case, Hive creates the staging directory under the table directory. When it moves the staging directory's contents into the table directory, Hive still empties the table directory, but it excludes any staging directory whose name starts with "." or "_": `public static final PathFilter HIDDEN_FILES_PATH_FILTER = new PathFilter() { public boolean accept(Path p) { String name = p.getName(); return !name.startsWith("_") && !name.startsWith("."); } };`
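To make the effect of that filter concrete, here is a minimal stdlib-only sketch that applies the same predicate to plain directory names. The class name, method names, and sample entries are hypothetical, and the sketch assumes (as the comment above describes) that an overwrite clears exactly the entries the filter accepts; it does not call Hive or Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration on plain names; not Hive's or Spark's actual code.
final class HiddenPathFilterSketch {

    // Same predicate as the quoted HIDDEN_FILES_PATH_FILTER, applied to a bare name.
    static boolean isVisible(String name) {
        return !name.startsWith("_") && !name.startsWith(".");
    }

    // Entries of the table directory that would be cleared: those the filter accepts.
    static List<String> clearedOnOverwrite(List<String> children) {
        List<String> cleared = new ArrayList<>();
        for (String child : children) {
            if (isVisible(child)) {
                cleared.add(child);
            }
        }
        return cleared;
    }

    public static void main(String[] args) {
        // Sample names are made up for the example.
        List<String> children = Arrays.asList(
            "part-00000", ".hive-staging_hive_x", "_tmp", "test_hive_x");
        // prints [part-00000, test_hive_x]
        System.out.println(clearedOnOverwrite(children));
    }
}
```

A staging directory named like `test_hive_x` is cleared along with the data files, which is consistent with the `FileNotFoundException` reported in this thread, whereas a name starting with "." or "_" is skipped.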
[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17858 Can one of the admins verify this patch?