[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14904016#comment-14904016 ]

Prasanth Jayachandran commented on HIVE-11319:
----------------------------------------------

Committed this patch to branch-1 as well.

> CTAS with location qualifier overwrites directories
> ---------------------------------------------------
>
>                 Key: HIVE-11319
>                 URL: https://issues.apache.org/jira/browse/HIVE-11319
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser, Security
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0
>            Reporter: Yongzhi Chen
>            Assignee: Yongzhi Chen
>              Labels: backward-incompatible
>             Fix For: 1.3.0, 2.0.0
>
>         Attachments: HIVE-11319.1.patch, HIVE-11319.2.patch
>
>
> CTAS with a location clause acts as an insert overwrite. This can cause
> problems when there are subdirectories within the directory.
> This has caused some users to accidentally wipe out directories with very
> important data. We should ban CTAS with a location that points to a
> non-empty directory.
> Reproduce:
> create table ctas1
> location '/Users/ychen/tmp'
> as
> select * from jsmall limit 10;
> create table ctas2
> location '/Users/ychen/tmp'
> as
> select * from jsmall limit 5;
> Both creates will succeed, but the data in table ctas1 will be accidentally
> replaced by that of ctas2.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905138#comment-14905138 ]

Prasanth Jayachandran commented on HIVE-11319:
----------------------------------------------

[~szehon]/[~ychena] This looks like an incompatible change. Is it intended for branch-1 as well? If not, I can revert the patch.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905219#comment-14905219 ]

Szehon Ho commented on HIVE-11319:
----------------------------------

I think it's backward incompatible in the sense that there is now an exception when trying to CTAS over a location that already has data, so I had marked it as such, but I am not sure how many people rely on this behavior. I would say it's branch-2 behavior, unless people think it's not so significant for a minor release.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905717#comment-14905717 ]

Lefty Leverenz commented on HIVE-11319:
---------------------------------------

If this gets reverted from branch-1, the TODOC1.3 label should be removed. But for now, let's keep both TODOC labels.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14905598#comment-14905598 ]

Prasanth Jayachandran commented on HIVE-11319:
----------------------------------------------

Sorry, that was an unintended change. I just noticed it.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659251#comment-14659251 ]

Yongzhi Chen commented on HIVE-11319:
-------------------------------------

Thanks [~szehon] for reviewing it.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653462#comment-14653462 ]

Yongzhi Chen commented on HIVE-11319:
-------------------------------------

The two failures are unrelated (they have failed 34 times).
[~szehon], [~spena], could you review the change? Thanks
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653364#comment-14653364 ]

Hive QA commented on HIVE-11319:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748529/HIVE-11319.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9320 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4814/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4814/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4814/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748529 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654091#comment-14654091 ]

Szehon Ho commented on HIVE-11319:
----------------------------------

It does look like a vulnerability to me. When impersonation is on, at least you can use file permissions to prevent it, but when impersonation is off it will be an issue.
+1 from my side; I will let others take a look too.
Also, this would be a backward-incompatible change, so we should mark it as such.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652129#comment-14652129 ]

Hive QA commented on HIVE-11319:
--------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748452/HIVE-11319.2.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4802/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4802/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4802/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.io.IOException: Could not create /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4802/succeeded/TestHCatHiveCompatibility
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748452 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651876#comment-14651876 ]

Yongzhi Chen commented on HIVE-11319:
-------------------------------------

Is the build machine out of disk space? Reattaching the second patch.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652423#comment-14652423 ]

Yongzhi Chen commented on HIVE-11319:
-------------------------------------

Tests did not run. Re-attaching.
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651405#comment-14651405 ]

Hive QA commented on HIVE-11319:
--------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748382/HIVE-11319.2.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4794/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4794/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4794/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4794/succeeded/TestEmbeddedHiveMetaStore, remoteFile=/home/hiveptest/54.158.176.48-hiveptest-0/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=54.158.176.48, getInstance()=0]: 'Address 54.158.176.48 maps to ec2-54-158-176-48.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
TEST-TestEmbeddedHiveMetaStore-TEST-org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.xml
0 0% 0.00kB/s 0:00:00
8612 100% 8.21MB/s 0:00:00 (xfer#1, to-check=3/5)
hive.log
0 0% 0.00kB/s 0:00:00
rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4794/succeeded/TestEmbeddedHiveMetaStore/hive.log: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (220 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6]
Address 54.158.176.48 maps to ec2-54-158-176-48.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
hive.log
0 0% 0.00kB/s 0:00:00
rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4794/succeeded/TestEmbeddedHiveMetaStore/hive.log: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (220 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6]
Address 54.158.176.48 maps to ec2-54-158-176-48.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
0 0% 0.00kB/s 0:00:00
rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4794/succeeded/TestEmbeddedHiveMetaStore/hive.log: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (220 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6]
Address 54.158.176.48 maps to ec2-54-158-176-48.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
0 0% 0.00kB/s 0:00:00
rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4794/succeeded/TestEmbeddedHiveMetaStore/hive.log: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (220 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6]
Address 54.158.176.48 maps to ec2-54-158-176-48.compute-1.amazonaws.com, but this does not map back to the address - POSSIBLE BREAK-IN ATTEMPT!
receiving incremental file list
./
hive.log
0 0% 0.00kB/s 0:00:00
rsync: write failed on /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-4794/succeeded/TestEmbeddedHiveMetaStore/hive.log: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(301) [receiver=3.0.6]
rsync: connection unexpectedly closed (220 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [generator=3.0.6]
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12748382 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650719#comment-14650719 ]

Yongzhi Chen commented on HIVE-11319:
-------------------------------------

The patch throws an error when the location points to a non-empty folder.
CTAS only creates internal tables, which means Hive is in full control of the folder related to the table. CTAS already throws an error when it tries to create a table that already exists; we should treat a non-empty folder the same way.
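The rule in the comment above (refuse CTAS when the LOCATION folder already contains anything) can be illustrated with a minimal sketch. This is not the code from HIVE-11319.1.patch or HIVE-11319.2.patch. It uses the local file system through java.nio.file as a stand-in for the file system Hive works against, and the class name, method names and error text are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not taken from the Hive patch.
final class CtasLocationCheck {

    // True when the path is an existing directory holding at least one entry
    // (a file or a subdirectory). A missing path or an empty directory is false.
    static boolean isNonEmptyDirectory(Path location) throws IOException {
        if (!Files.isDirectory(location)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(location)) {
            return entries.iterator().hasNext();
        }
    }

    // Assumed guard: fail before any data is written, instead of overwriting.
    static void validateCtasLocation(Path location) throws IOException {
        if (isNonEmptyDirectory(location)) {
            throw new IllegalStateException(
                "CTAS target location is not empty: " + location);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ctas-demo");
        validateCtasLocation(dir);                  // empty folder: accepted
        Files.createFile(dir.resolve("000000_0"));  // simulate data left by a first CTAS
        try {
            validateCtasLocation(dir);              // second CTAS on the same folder
        } catch (IllegalStateException e) {
            System.out.println("second CTAS rejected");
        }
    }
}
```

With a guard of this shape, the second create in the reproduce steps (ctas2 pointing at the folder already populated by ctas1) would fail instead of silently replacing the data.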
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14650724#comment-14650724 ]

Hive QA commented on HIVE-11319:
--------------------------------

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12748354/HIVE-11319.1.patch

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4791/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4791/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4791/

Messages:
{noformat}
This message was trimmed, see log for full details
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ spark-client ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp
    [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/warehouse
    [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
     [copy] Copying 11 files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/tmp/conf
[INFO] Executed tasks
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ spark-client ---
[INFO] Compiling 5 source files to /data/hive-ptest/working/apache-github-source-source/spark-client/target/test-classes
[INFO]
[INFO] --- maven-dependency-plugin:2.8:copy (copy-guava-14) @ spark-client ---
[INFO] Configured Artifact: com.google.guava:guava:14.0.1:jar
[INFO] Copying guava-14.0.1.jar to /data/hive-ptest/working/apache-github-source-source/spark-client/target/dependency/guava-14.0.1.jar
[INFO]
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ spark-client ---
[INFO] Tests are skipped.
[INFO]
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ spark-client ---
[INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ spark-client ---
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ spark-client ---
[INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/target/spark-client-2.0.0-SNAPSHOT.jar to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.jar
[INFO] Installing /data/hive-ptest/working/apache-github-source-source/spark-client/pom.xml to /home/hiveptest/.m2/repository/org/apache/hive/spark-client/2.0.0-SNAPSHOT/spark-client-2.0.0-SNAPSHOT.pom
[INFO]
[INFO]
[INFO] Building Hive Query Language 2.0.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-exec ---
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql/target
[INFO] Deleting /data/hive-ptest/working/apache-github-source-source/ql (includes = [datanucleus.log, derby.log], excludes = [])
[INFO]
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-exec ---
[INFO]
[INFO] --- maven-antrun-plugin:1.7:run (generate-sources) @ hive-exec ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
    [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen
    [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-test-sources/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen
Generating vector expression code
Generating vector expression test code
[INFO] Executed tasks
[INFO]
[INFO] --- build-helper-maven-plugin:1.8:add-source (add-source) @ hive-exec ---
[INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/protobuf/gen-java added.
[INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/src/gen/thrift/gen-javabean added.
[INFO] Source directory: /data/hive-ptest/working/apache-github-source-source/ql/target/generated-sources/java added.
[INFO]
[INFO] --- antlr3-maven-plugin:3.4:antlr (default) @ hive-exec ---
[INFO] ANTLR: Processing source directory /data/hive-ptest/working/apache-github-source-source/ql/src/java
ANTLR Parser Generator Version 3.4
org/apache/hadoop/hive/ql/parse/HiveLexer.g
[jira] [Commented] (HIVE-11319) CTAS with location qualifier overwrites directories
[ https://issues.apache.org/jira/browse/HIVE-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651340#comment-14651340 ]

Yongzhi Chen commented on HIVE-11319:
-------------------------------------

The build failed because the patch uses a newer method (isDirectory) that is not supported. Changed it to the older one (isDir) and attached a second patch.