[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/17858
  
@gatorsmile It seems this was my mistake; I will try to fix it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/17858
  
@gatorsmile My production environment runs Spark 2.0.2, and the test succeeded there.
Has something changed since 2.0.2 for this case? Thanks!





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17858
  
**[Test build #76566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76566/testReport)** for PR 17858 at commit [`de938ed`](https://github.com/apache/spark/commit/de938ed91f6e62b02fed32c7123f3aae5d51d9f7).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17858
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/76566/
Test FAILed.





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17858
  
Merged build finished. Test FAILed.





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/17858
  
**[Test build #76566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/76566/testReport)** for PR 17858 at commit [`de938ed`](https://github.com/apache/spark/commit/de938ed91f6e62b02fed32c7123f3aae5d51d9f7).





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17858
  
ok to test





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17858
  
This will not pass the test cases, because we only deleted the child 
directory `.hive-staging` of the `stagingDir`





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-08 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/17858
  
Yes, I tried the same thing in Hive and got the same error:
`2017-05-08T13:48:04,634 ERROR exec.Task (:()) - Failed with exception Unable to move source hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 to destination hdfs://nameservice/hive/test_table1
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to move source hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 to destination hdfs://nameservice/hive/test_table1
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2959)
at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:3198)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1805)
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:355)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1917)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1586)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1331)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1092)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1080)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.FileNotFoundException: File hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:697)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1525)
at org.apache.hadoop.hive.ql.metadata.Hive.moveFile(Hive.java:2896)
... 22 more

2017-05-08T13:48:04,635 ERROR ql.Driver (:()) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. Unable to move source hdfs://nameservice/hive/test_table1/test_hive_2017-05-08_13-47-40_660_5235248825413690559-1/-ext-1 to destination hdfs://nameservice/hive/test_table1`
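
The trace is consistent with the staging directory being removed while the table directory is emptied, so that the later listing of the source path fails with `FileNotFoundException`. The following Java sketch reproduces that sequence on a local temporary directory. It is only an illustration under assumptions: it uses `java.nio.file` instead of HDFS, the class and method names (`StagingMoveSimulation`, `clearVisibleChildren`, `stagingSurvives`) are hypothetical, and it does not call any Hive code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class StagingMoveSimulation {
    // Collect the direct children of a directory before modifying it.
    static List<Path> children(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                result.add(p);
            }
        }
        return result;
    }

    static void deleteRecursively(Path p) throws IOException {
        if (Files.isDirectory(p)) {
            for (Path child : children(p)) {
                deleteRecursively(child);
            }
        }
        Files.delete(p);
    }

    // Assumed behavior: empty the table directory, skipping names that start with "." or "_".
    static void clearVisibleChildren(Path tableDir) throws IOException {
        for (Path child : children(tableDir)) {
            String name = child.getFileName().toString();
            if (!name.startsWith("_") && !name.startsWith(".")) {
                deleteRecursively(child);
            }
        }
    }

    // Returns true if a staging directory with the given name is still present
    // after the table directory has been emptied.
    public static boolean stagingSurvives(String stagingName) {
        try {
            Path tableDir = Files.createTempDirectory("test_table1");
            try {
                Path staging = Files.createDirectories(tableDir.resolve(stagingName).resolve("-ext-1"));
                Files.createFile(staging.resolve("part-00000"));
                clearVisibleChildren(tableDir);
                return Files.exists(staging);
            } finally {
                deleteRecursively(tableDir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("test_hive_x survives: " + stagingSurvives("test_hive_x"));
        System.out.println(".hive-staging_x survives: " + stagingSurvives(".hive-staging_x"));
    }
}
```

In this simulation a staging directory named `test_hive_x` is deleted along with the other table contents, while one named `.hive-staging_x` is left in place.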





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-05 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/17858
  
If this is a Hive bug, this patch seems like a valid workaround for Spark SQL.





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-04 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/17858
  
This sounds like a bug in the Hive metastore. Could you try the same thing in Hive?
Do you hit the same error? Let us see how Hive behaves, and then we can decide
on the best way to handle it. Thanks!

BTW, you need to add a test case, for example in
`InsertIntoHiveTableSuite.scala`.





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-04 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/17858
  
Do you really need to force this? Or is it just that any path relative to
the output dir has to be a hidden directory starting with "." or "_"? For
example, right now this prevents me from making the staging dir "/foo/bar", but
I don't see a reason to disallow that.
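
One way to read this question: only a staging directory that resolves to a location inside the output directory needs a hidden name, while an absolute path elsewhere, such as "/foo/bar", stays allowed. Below is a minimal Java sketch of that rule. The names `StagingDirCheck` and `isSafe` are hypothetical, the paths are made up, and the rule itself is an assumption about what such a check could look like, not what Spark or Hive implements.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class StagingDirCheck {
    // Hypothetical rule: a staging dir outside the output dir is always fine;
    // one inside it must have a first path component starting with "." or "_".
    public static boolean isSafe(String outputDir, String stagingDir) {
        Path out = Paths.get(outputDir).normalize();
        // resolve() keeps an absolute stagingDir as-is and appends a relative one to out
        Path staging = out.resolve(stagingDir).normalize();
        if (staging.equals(out)) {
            return false; // the output dir itself cannot double as the staging dir
        }
        if (!staging.startsWith(out)) {
            return true; // e.g. "/foo/bar": not under the output dir
        }
        String first = out.relativize(staging).getName(0).toString();
        return first.startsWith(".") || first.startsWith("_");
    }

    public static void main(String[] args) {
        System.out.println(isSafe("/warehouse/t1", ".hive-staging"));
        System.out.println(isSafe("/warehouse/t1", "test_hive"));
        System.out.println(isSafe("/warehouse/t1", "/foo/bar"));
    }
}
```

With a rule like this, the hidden-name requirement applies only where the staging directory would sit among the table's own files.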





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-04 Thread zuotingbing
Github user zuotingbing commented on the issue:

https://github.com/apache/spark/pull/17858
  
In this case, Hive creates the staging directory under the table
directory. When moving the staging directory to the table directory, Hive
still empties the table directory, but it excludes a staging directory whose
name starts with "." or "_":

`public static final PathFilter HIDDEN_FILES_PATH_FILTER = new PathFilter() {
  public boolean accept(Path p) {
    String name = p.getName();
    return !name.startsWith("_") && !name.startsWith(".");
  }
};`
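
To make the effect of this filter concrete, here is a small self-contained Java sketch that applies the same name test to plain strings. It does not depend on Hadoop. The class name `HiddenNameFilter`, the helper `visible`, and the sample file names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HiddenNameFilter {
    // Same predicate as the quoted filter, applied to a bare file name.
    public static boolean accept(String name) {
        return !name.startsWith("_") && !name.startsWith(".");
    }

    // Names the filter accepts, i.e. entries treated as ordinary (non-hidden) content.
    public static List<String> visible(List<String> names) {
        return names.stream().filter(HiddenNameFilter::accept).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
            "part-00000", ".hive-staging_x", "_SUCCESS", "test_hive_x");
        System.out.println(visible(children));
    }
}
```

Of the four sample names, only `part-00000` and `test_hive_x` are accepted, so a staging directory named like `test_hive_x` is treated as ordinary table content rather than as a hidden entry.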





[GitHub] spark issue #17858: [SPARK-20594][SQL]The staging directory should be append...

2017-05-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/17858
  
Can one of the admins verify this patch?

