RE: More than one table created at the same location

2016-08-31 Thread Santlal J Gupta
-Original Message- From: naveen mahadevuni [mailto:nmahadev...@gmail.com] Sent: Wednesday, August 31, 2016 7:42 PM To: dev@hive.apache.org Subject: Re: More than one table created at the same location Hi, I created an external table, copied data files to that location and then cou

Re: More than one table created at the same location

2016-08-31 Thread naveen mahadevuni
Hi, I created an external table, copied data files to that location, and then count returns 4. This is ambiguous; can it be documented?
hive> CREATE EXTERNAL TABLE test_ext (col1 INT, col2 INT)
    > stored as orc
    > LOCATION '/apps/hive/warehouse/ext';
OK
Time taken: 9.875 seconds
hive> select

Re: More than one table created at the same location

2016-08-30 Thread Thejas Nair
Naveen, can you please verify whether the results are correct if you create these tables as external tables? In the case of managed tables, the assumption is that there is a 1:1 mapping between tables and locations and that all updates to the table go through Hive. With that assumption, it relies on stats to

Re: More than one table created at the same location

2016-08-30 Thread Abhishek Somani
For the 2nd table (after both inserts are over), isn't the returned count expected to be 4? In that case, isn't the bug that the wrong count was returned (maybe from the stats, as mentioned) rather than the fact that another table was allowed to be created at the same location? I might be very wrong,

Re: More than one table created at the same location

2016-08-29 Thread Alan Gates
Note that Hive doesn’t track individual files, just which directory a table stores its files in. So we wouldn’t expect this to work. The bug is more that Hive doesn’t detect that two tables are trying to use the same directory. I’m not sure we’re anxious to fix this since it would mean when

Re: More than one table created at the same location

2016-08-29 Thread Sergey Shelukhin
This is a bug, or rather an unexpected usage. I suspect the correct count value is coming from statistics. Can you file a JIRA? On 16/8/29, 00:51, "naveen mahadevuni" wrote: >Hi, > >Is the following behavior a bug? I believe at least one part of it is a >bug. I created

More than one table created at the same location

2016-08-29 Thread naveen mahadevuni
Hi, Is the following behavior a bug? I believe at least one part of it is a bug. I created two Hive tables at the same location and inserted rows into both tables. count(*) returns the correct count for each individual table, but SELECT * on one table reads the rows from the other table's files too.
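
The behavior discussed in this thread can be illustrated with a toy model. This is only a sketch and not Hive code: the class and method names (ToyWarehouse, count_from_stats, select_all) are hypothetical. It assumes, as the replies suggest, that a table is tied to a directory rather than to individual files, and that count(*) on a managed table may be answered from per-table statistics.

```python
# Toy model only. Nothing here is a Hive API; all names are hypothetical.
from collections import defaultdict


class ToyWarehouse:
    def __init__(self):
        self.files = defaultdict(list)  # directory -> list of data files (each a list of rows)
        self.location = {}              # table name -> directory
        self.row_count_stat = {}        # table name -> row count kept as a statistic

    def create_table(self, name, location):
        # No check that another table already points at this directory.
        self.location[name] = location
        self.row_count_stat[name] = 0

    def insert(self, name, rows):
        # An insert writes a new file into the table's directory and
        # updates only that table's statistic.
        self.files[self.location[name]].append(list(rows))
        self.row_count_stat[name] += len(rows)

    def count_from_stats(self, name):
        # Answered from the per-table statistic, without scanning files.
        return self.row_count_stat[name]

    def select_all(self, name):
        # A scan reads every file in the directory, whoever wrote it.
        return [row for f in self.files[self.location[name]] for row in f]


wh = ToyWarehouse()
wh.create_table("t1", "/warehouse/shared")
wh.create_table("t2", "/warehouse/shared")
wh.insert("t1", [(1, 2), (3, 4)])
wh.insert("t2", [(5, 6), (7, 8)])

print(wh.count_from_stats("t1"))  # 2: each table counts only its own inserts
print(wh.count_from_stats("t2"))  # 2
print(len(wh.select_all("t1")))   # 4: the scan also sees t2's rows
```

In this model the statistic and the scan disagree as soon as two tables share a directory, which matches the symptom reported above.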