[jira] Commented: (HIVE-78) Authorization infrastructure for Hive

2010-11-15 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932306#action_12932306
 ] 

Namit Jain commented on HIVE-78:


A few minor comments:

1. Can you add more comments in the M* files (the new files in the metastore)?
2. MRoleEntity needs a database name - so does the thrift file?
3. Can you verify that create and create table as select work for hive 
replication?
4. Can you check who adds inputs/outputs for locking operations?


 Authorization infrastructure for Hive
 -

 Key: HIVE-78
 URL: https://issues.apache.org/jira/browse/HIVE-78
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor, Server Infrastructure
Reporter: Ashish Thusoo
Assignee: He Yongqiang
 Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, 
 hive-78-syntax-v1.patch, HIVE-78.1.nothrift.patch, HIVE-78.1.thrift.patch, 
 HIVE-78.2.nothrift.patch, HIVE-78.2.thrift.patch, hive-78.diff


 Allow hive to integrate with existing user repositories for authentication 
 and authorization information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-78) Authorization infrastructure for Hive

2010-11-15 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932354#action_12932354
 ] 

Namit Jain commented on HIVE-78:


Driver:
  // do the authorization check
  if (HiveConf.getBoolVar(conf,
      HiveConf.ConfVars.HIVE_AUTHORIZATION_ENABLED)) {
    boolean pass = doAuthorization(sem);
    if (!pass) {
      console.printError("Authorization failed (not enough privileges found to run the query.).");
      return (400);
    }
  }


Can we print the reason, i.e. which privilege was missing?
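
One way to surface the missing privilege is to have the check return a set difference rather than a boolean. The following is only a sketch: AuthorizationResult, the Privilege enum and describe() are hypothetical names for illustration, not part of the HIVE-78 patch.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not the HIVE-78 API: records which privileges are missing.
class AuthorizationResult {
  enum Privilege { SELECT, INSERT, ALTER, DROP }

  // Privileges the query needs that the principal does not hold.
  final Set<Privilege> missing;

  AuthorizationResult(Set<Privilege> required, Set<Privilege> granted) {
    EnumSet<Privilege> diff = EnumSet.noneOf(Privilege.class);
    diff.addAll(required);
    diff.removeAll(granted);
    this.missing = diff;
  }

  boolean passed() {
    return missing.isEmpty();
  }

  // Builds a message that names each missing privilege and the object involved.
  String describe(String objectName) {
    if (passed()) {
      return "Authorization passed for " + objectName;
    }
    return "Authorization failed: missing " + missing + " on " + objectName;
  }
}
```

The Driver could then pass describe(...) to console.printError instead of a fixed string.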



Can we optimize this scenario? We are checking all partitions one by one,
both for inputs and outputs. If the user/group/role has the table
privilege, we don't need to go over the partitions one by one.
We can even do this in a follow-up.
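
The short-circuit could look roughly like this. It is a sketch under the assumption that a table-level grant covers every partition of the table; PartitionAuthCheck and the "table/partition" naming of grants are made up for illustration.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not from the HIVE-78 patch.
class PartitionAuthCheck {
  // grants holds the object names the principal may access, e.g. "t" or "t/ds=1".
  static boolean authorized(Set<String> grants, String table, List<String> partitions) {
    // Assumption: a table-level grant implies access to all partitions,
    // so the per-partition loop can be skipped entirely.
    if (grants.contains(table)) {
      return true;
    }
    for (String partition : partitions) {
      if (!grants.contains(table + "/" + partition)) {
        return false;
      }
    }
    return true;
  }
}
```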



Why do we need the change in QueryPlan ?

showGrants: should the output have a schema? Going forward, it will
be easier for JDBC clients to parse.

No need to change WriteEntity etc. ?

'user' cannot be made a reserved word: ~20 tables at Facebook have a column
called 'user'. Please check 'role' and 'option' as well.

SemanticAnalyzer: 3511 not needed


What happens to replication of roles - needs to be done


Where are the privileges copied for a newly created partition ?


 Authorization infrastructure for Hive
 -

 Key: HIVE-78
 URL: https://issues.apache.org/jira/browse/HIVE-78
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor, Server Infrastructure
Reporter: Ashish Thusoo
Assignee: He Yongqiang
 Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, 
 hive-78-syntax-v1.patch, HIVE-78.1.nothrift.patch, HIVE-78.1.thrift.patch, 
 HIVE-78.2.nothrift.patch, HIVE-78.2.thrift.patch, hive-78.diff


 Allow hive to integrate with existing user repositories for authentication 
 and authorization information.




[jira] Commented: (HIVE-1642) Convert join queries to map-join based on size of table/row

2010-11-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932666#action_12932666
 ] 

Namit Jain commented on HIVE-1642:
--

hive-default.xml

<property>
  <name>hive.mapjoin.hashtable.threshold</name>
  <value>10</value>
  <description>the threshold for the mapjoin hashtable</description>
</property>

<property>
  <name>hive.mapjoin.hashtable.loadfactor</name>
  <value>0.75</value>
  <description>the load factor for the mapjoin hashtable</description>
</property>

<property>
  <name>hive.mapjoin.smalltable.filesize</name>
  <value>2500</value>
  <description>The threshold for the input file size of the small tables; if the file size is smaller than this threshold, it will try to concert the common join into map join</description>
</property>

<property>
  <name>hive.mapjoin.localtask.max.memory.usage</name>
  <value>0.90</value>
  <description>The max memory usage of the local task for map join</description>
</property>



Add more comments for properties 1, 2 and 4.
Spelling mistake in the third: concert -> convert


Uncheckout DriverContext.java


Why should the backup task be obtained from the resolver?
Can't it be created at task creation time itself?


 Convert join queries to map-join based on size of table/row
 ---

 Key: HIVE-1642
 URL: https://issues.apache.org/jira/browse/HIVE-1642
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive_1642_1.patch, hive_1642_2.patch, hive_1642_4.patch


 Based on the number of rows and size of each table, Hive should automatically 
 be able to convert a join into map-join.




[jira] Created: (HIVE-1796) dumps time at which lock was taken along with the queryid in show locks T extended

2010-11-16 Thread Namit Jain (JIRA)
dumps time at which lock was taken along with the queryid in show locks T 
extended


 Key: HIVE-1796
 URL: https://issues.apache.org/jira/browse/HIVE-1796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.7.0


It would be useful to dump the time at which the lock was taken for debugging




[jira] Commented: (HIVE-1642) Convert join queries to map-join based on size of table/row

2010-11-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932741#action_12932741
 ] 

Namit Jain commented on HIVE-1642:
--

ConditionalResolverCommonJoin

// generate file size to alias mapping; but connot set file size as key,
// using 2 list to keep mapping

spelling (connot)

 Convert join queries to map-join based on size of table/row
 ---

 Key: HIVE-1642
 URL: https://issues.apache.org/jira/browse/HIVE-1642
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1642_5.patch, hive-1642_6.patch, hive_1642_1.patch, 
 hive_1642_2.patch, hive_1642_4.patch


 Based on the number of rows and size of each table, Hive should automatically 
 be able to convert a join into map-join.




[jira] Commented: (HIVE-1642) Convert join queries to map-join based on size of table/row

2010-11-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932744#action_12932744
 ] 

Namit Jain commented on HIVE-1642:
--

ConditionalResolverCommonJoin

  // Iterate the sorted_set to get big/small table file size
  for (int index = 0; index < sortedList.size(); index++) {
    Long key = sortedList.get(index);
    int i = fileSizeList.indexOf(key);
    String alias = aliasList.get(i);

    if (index != (size - 1)) {
      smallTablesFileSizeSum += key.longValue();
    } else {
      bigTableFileSize += key.longValue();
      bigTableFileAlias = alias;
    }
  }


The lines:

int i = fileSizeList.indexOf(key);
String alias = aliasList.get(i);

are only needed in the 'else' block
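
With the lookup moved into the 'else' branch, the loop might read as follows. This is a self-contained sketch: the JoinSizeSummary wrapper, the summarize() method and the way sortedList is built are assumptions made so the snippet runs on its own; only the loop body mirrors the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper around the loop quoted above.
class JoinSizeSummary {
  long smallTablesFileSizeSum = 0;
  long bigTableFileSize = 0;
  String bigTableFileAlias = null;

  // aliasList.get(i) is the alias whose input size is fileSizeList.get(i).
  static JoinSizeSummary summarize(List<String> aliasList, List<Long> fileSizeList) {
    JoinSizeSummary s = new JoinSizeSummary();
    // Assumption: sortedList is an ascending copy of the file sizes.
    List<Long> sortedList = new ArrayList<>(fileSizeList);
    Collections.sort(sortedList);
    int size = sortedList.size();
    for (int index = 0; index < size; index++) {
      Long key = sortedList.get(index);
      if (index != (size - 1)) {
        s.smallTablesFileSizeSum += key.longValue();
      } else {
        // The alias lookup is only needed for the largest table.
        int i = fileSizeList.indexOf(key);
        s.bigTableFileSize += key.longValue();
        s.bigTableFileAlias = aliasList.get(i);
      }
    }
    return s;
  }
}
```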

 Convert join queries to map-join based on size of table/row
 ---

 Key: HIVE-1642
 URL: https://issues.apache.org/jira/browse/HIVE-1642
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1642_5.patch, hive-1642_6.patch, hive_1642_1.patch, 
 hive_1642_2.patch, hive_1642_4.patch


 Based on the number of rows and size of each table, Hive should automatically 
 be able to convert a join into map-join.




[jira] Updated: (HIVE-1795) outputs not correctly populated for alter table

2010-11-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1795:
-

Attachment: hive.1795.1.patch

 outputs not correctly populated for alter table
 ---

 Key: HIVE-1795
 URL: https://issues.apache.org/jira/browse/HIVE-1795
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.7.0

 Attachments: hive.1795.1.patch


 For any statement of the form:
 alter table T partition p ...
 the table T is added to the outputs. This leads to problems with locking, and 
 will lead to problems in future for authorization.
 The partition should be in the outputs, not the table.




[jira] Commented: (HIVE-1642) Convert join queries to map-join based on size of table/row

2010-11-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932843#action_12932843
 ] 

Namit Jain commented on HIVE-1642:
--

+1 running tests

 Convert join queries to map-join based on size of table/row
 ---

 Key: HIVE-1642
 URL: https://issues.apache.org/jira/browse/HIVE-1642
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1642_10.patch, hive-1642_11.patch, 
 hive-1642_5.patch, hive-1642_6.patch, hive-1642_7.patch, hive-1642_9.patch, 
 hive_1642_1.patch, hive_1642_2.patch, hive_1642_4.patch


 Based on the number of rows and size of each table, Hive should automatically 
 be able to convert a join into map-join.




[jira] Commented: (HIVE-1785) change Pre/Post Query Hooks to take in 1 parameter: HookContext

2010-11-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12933044#action_12933044
 ] 

Namit Jain commented on HIVE-1785:
--

Can you regenerate the patch ?

I have already committed HIVE-1642

 change Pre/Post Query Hooks to take in 1 parameter: HookContext
 ---

 Key: HIVE-1785
 URL: https://issues.apache.org/jira/browse/HIVE-1785
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Attachments: hive_1785_1.patch, hive_1785_2.patch


 This way, it would be possible to add new parameters to the hooks without 
 changing the existing hooks.
 This will be an incompatible change, and all the hooks need to change to the 
 new API.
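
 The idea of a single context parameter can be sketched as below. HookContextSketch and ExecuteHookSketch are hypothetical stand-ins; the actual HookContext fields and hook interfaces are defined by the HIVE-1785 patch, not here.

```java
// Hypothetical stand-in for the context object passed to every hook.
class HookContextSketch {
  private final String queryId;
  private final String userName;

  HookContextSketch(String queryId, String userName) {
    this.queryId = queryId;
    this.userName = userName;
  }

  String getQueryId() {
    return queryId;
  }

  String getUserName() {
    return userName;
  }
  // A new getter can be added here later without touching any hook signature.
}

// Hypothetical hook interface: one parameter, so the signature stays stable.
interface ExecuteHookSketch {
  void run(HookContextSketch context);
}
```

 Existing hooks only read the getters they care about, so adding a field to the context does not break them.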




[jira] Updated: (HIVE-1783) CommonJoinOperator optimize the case of 1:1 join

2010-11-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1783:
-

Status: Open  (was: Patch Available)

Can you refresh the patch ? HIVE-1642 has been committed, so this is good to go

 CommonJoinOperator optimize the case of 1:1 join
 

 Key: HIVE-1783
 URL: https://issues.apache.org/jira/browse/HIVE-1783
 Project: Hive
  Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-1783.1.patch, HIVE-1783.2.patch


 CommonJoinOperator.genObject() is expensive. It recurses and keeps a lot of 
 state because it has to:
 1. handle null cases for outer joins
 2. handle the case of duplicated keys from one join party
 We can do a minor optimization to detect a 1:1 join (which is quite common) 
 before calling CommonJoinOperator.genObject() and forward columns in a simple 
 for-loop if we are sure that neither 1 nor 2 will happen.
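
 A rough sketch of the fast path, assuming the rows sharing the current join key are available per input as a list (OneToOneJoinSketch and its methods are hypothetical, not CommonJoinOperator code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the 1:1 check and the flat forwarding loop.
class OneToOneJoinSketch {
  // rowsPerAlias.get(i) holds the rows from join input i that share the current key.
  static boolean isOneToOne(List<List<Object[]>> rowsPerAlias) {
    for (List<Object[]> rows : rowsPerAlias) {
      if (rows.size() != 1) {
        // Empty input (outer-join null case) or duplicated keys: use the general path.
        return false;
      }
    }
    return true;
  }

  // Fast path: concatenate the single row of each input in one simple loop.
  static Object[] forwardOneToOne(List<List<Object[]>> rowsPerAlias) {
    List<Object> out = new ArrayList<>();
    for (List<Object[]> rows : rowsPerAlias) {
      for (Object col : rows.get(0)) {
        out.add(col);
      }
    }
    return out.toArray();
  }
}
```

 When isOneToOne() returns false, the caller would fall back to the existing genObject() path.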




[jira] Updated: (HIVE-1796) dumps time at which lock was taken along with the queryid in show locks T extended

2010-11-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1796:
-

Attachment: hive.1796.1.patch

 dumps time at which lock was taken along with the queryid in show locks T 
 extended
 

 Key: HIVE-1796
 URL: https://issues.apache.org/jira/browse/HIVE-1796
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.7.0

 Attachments: hive.1796.1.patch


 It would be useful to dump the time at which the lock was taken for debugging




[jira] Commented: (HIVE-1785) change Pre/Post Query Hooks to take in 1 parameter: HookContext

2010-11-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12933062#action_12933062
 ] 

Namit Jain commented on HIVE-1785:
--

+1

running tests

 change Pre/Post Query Hooks to take in 1 parameter: HookContext
 ---

 Key: HIVE-1785
 URL: https://issues.apache.org/jira/browse/HIVE-1785
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Attachments: hive-1785_3.patch, hive_1785_1.patch, hive_1785_2.patch


 This way, it would be possible to add new parameters to the hooks without 
 changing the existing hooks.
 This will be an incompatible change, and all the hooks need to change to the 
 new API.




[jira] Updated: (HIVE-1785) change Pre/Post Query Hooks to take in 1 parameter: HookContext

2010-11-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1785:
-

Hadoop Flags: [Reviewed]
  Status: Patch Available  (was: Open)

 change Pre/Post Query Hooks to take in 1 parameter: HookContext
 ---

 Key: HIVE-1785
 URL: https://issues.apache.org/jira/browse/HIVE-1785
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Attachments: hive-1785_3.patch, hive_1785_1.patch, hive_1785_2.patch


 This way, it would be possible to add new parameters to the hooks without 
 changing the existing hooks.
 This will be an incompatible change, and all the hooks need to change to the 
 new API.




[jira] Commented: (HIVE-1611) Add alternative search-provider to Hive site

2010-11-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12933065#action_12933065
 ] 

Namit Jain commented on HIVE-1611:
--

@Edward, can we get this in ?


 Add alternative search-provider to Hive site
 

 Key: HIVE-1611
 URL: https://issues.apache.org/jira/browse/HIVE-1611
 Project: Hive
  Issue Type: Improvement
Reporter: Alex Baranau
Assignee: Edward Capriolo
Priority: Minor
 Attachments: HIVE-1611.patch


 Use the search-hadoop.com service to make search available across Hive sources, 
 mailing lists, wiki, etc.
 This was initially proposed on the user mailing list. The search service was 
 already added to the site's skin (common for all Hadoop-related projects), 
 so this issue is about enabling it for Hive. The ultimate goal is to use it 
 on all Hadoop sub-projects' sites.




[jira] Commented: (HIVE-1783) CommonJoinOperator optimize the case of 1:1 join

2010-11-18 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12933516#action_12933516
 ] 

Namit Jain commented on HIVE-1783:
--

+1

running tests

 CommonJoinOperator optimize the case of 1:1 join
 

 Key: HIVE-1783
 URL: https://issues.apache.org/jira/browse/HIVE-1783
 Project: Hive
  Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-1783.1.patch, HIVE-1783.2.patch, HIVE-1783.3.patch, 
 HIVE-1783.4.patch


 CommonJoinOperator.genObject() is expensive. It recurses and keeps a lot of 
 state because it has to:
 1. handle null cases for outer joins
 2. handle the case of duplicated keys from one join party
 We can do a minor optimization to detect a 1:1 join (which is quite common) 
 before calling CommonJoinOperator.genObject() and forward columns in a simple 
 for-loop if we are sure that neither 1 nor 2 will happen.




[jira] Updated: (HIVE-1783) CommonJoinOperator optimize the case of 1:1 join

2010-11-18 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1783:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Siying

 CommonJoinOperator optimize the case of 1:1 join
 

 Key: HIVE-1783
 URL: https://issues.apache.org/jira/browse/HIVE-1783
 Project: Hive
  Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-1783.1.patch, HIVE-1783.2.patch, HIVE-1783.3.patch, 
 HIVE-1783.4.patch


 CommonJoinOperator.genObject() is expensive. It recurses and keeps a lot of 
 state because it has to:
 1. handle null cases for outer joins
 2. handle the case of duplicated keys from one join party
 We can do a minor optimization to detect a 1:1 join (which is quite common) 
 before calling CommonJoinOperator.genObject() and forward columns in a simple 
 for-loop if we are sure that neither 1 nor 2 will happen.




[jira] Commented: (HIVE-1787) optimize the code path when there are no outer joins

2010-11-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934405#action_12934405
 ] 

Namit Jain commented on HIVE-1787:
--

+1

running tests.

How much improvement did it lead to in the join queries ?

 optimize the code path when there are no outer joins
 

 Key: HIVE-1787
 URL: https://issues.apache.org/jira/browse/HIVE-1787
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-1787.1.patch


 Currently, outer joins and joins are handled in the same manner - a special 
 case for no outer joins would be useful




[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-11-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934619#action_12934619
 ] 

Namit Jain commented on HIVE-1648:
--

I haven't taken a look at the code, but here are the comments for the tests


Instead of:

desc extended table_name

in the tests, please use:

show table extended like `table_name`;

This will dump the stats on separate lines, which can be easily compared.
The non-deterministic stats are ignored.

Add a test for limit in the sub-query.

Don't select from the existing tables (src/src1) in your stats tests.
Create new tables and then set hive.stats.autogather.read to true.
This way, you can be sure that the remaining tests will not be affected.

Add another test for a 3-way join where the join keys are not the same, 
something like:

select .. from A join B on A.key1 = B.key1 join C on B.key2 = C.key2 where 


 Automatically gathering stats when reading a table/partition
 

 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Paul Butler
 Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.patch


 HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
 gather stats. This requires an additional scan of the data. Stats gathering 
 can be piggy-backed on TableScanOperator whenever a table/partition is 
 scanned (given no LIMIT operator). 




[jira] Commented: (HIVE-1805) Ability to create dynamic partitions atomically

2010-11-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934657#action_12934657
 ] 

Namit Jain commented on HIVE-1805:
--

Currently, if a query creates partitions dynamically, some of them may be 
created while others fail.
It would be useful to have an atomic way of running the query: either all the 
partitions should be created or none of them.

The same problem exists for multi-table inserts, but that is not a very common 
scenario.
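
One possible shape for the all-or-nothing behaviour is to stage every partition first and publish only if all of them were built. The sketch below models this with in-memory maps; AtomicPartitionCreate, createAll() and the build function are hypothetical, and a real implementation would also need the publish step itself to be atomic against the metastore and dfs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory model of "stage everything, then publish".
class AtomicPartitionCreate {
  // target maps partition name to location; build produces the location for a name.
  static boolean createAll(Map<String, String> target,
                           Iterable<String> partitionNames,
                           Function<String, String> build) {
    Map<String, String> staged = new LinkedHashMap<>();
    try {
      for (String name : partitionNames) {
        staged.put(name, build.apply(name));
      }
    } catch (RuntimeException e) {
      // One partition failed: discard the staged ones and publish nothing.
      return false;
    }
    target.putAll(staged);
    return true;
  }
}
```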

 Ability to create dynamic partitions atomically
 ---

 Key: HIVE-1805
 URL: https://issues.apache.org/jira/browse/HIVE-1805
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain






[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-11-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934682#action_12934682
 ] 

Namit Jain commented on HIVE-1648:
--

In SemanticAnalyzer:addStatsTask:

 } else {
   List<Node> children = (List<Node>) op.getChildren();
   if (children != null) {
     for (Node child : children) {
       opsToProcess.add((Operator<? extends Serializable>) child);
     }
   }


Why is the above code block needed? A TableScan can only be at the top.


Also, can you check for Conditional Tasks in addition to MapRedTask ?


 Automatically gathering stats when reading a table/partition
 

 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Paul Butler
 Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.patch


 HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
 gather stats. This requires an additional scan of the data. Stats gathering 
 can be piggy-backed on TableScanOperator whenever a table/partition is 
 scanned (given no LIMIT operator). 




[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-11-22 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12934683#action_12934683
 ] 

Namit Jain commented on HIVE-1648:
--

Otherwise, it looks OK

 Automatically gathering stats when reading a table/partition
 

 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Paul Butler
 Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.patch


 HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
 gather stats. This requires an additional scan of the data. Stats gathering 
 can be piggy-backed on TableScanOperator whenever a table/partition is 
 scanned (given no LIMIT operator). 




[jira] Commented: (HIVE-1792) track the joins which are being converted to map-join automatically

2010-11-23 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935167#action_12935167
 ] 

Namit Jain commented on HIVE-1792:
--

No need for this

 track the joins which are being converted to map-join automatically
 ---

 Key: HIVE-1792
 URL: https://issues.apache.org/jira/browse/HIVE-1792
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1792-1.patch, hive-1792-2.patch


 We should be able to track how many queries (join) got converted to
 map-join




[jira] Commented: (HIVE-1792) track the joins which are being converted to map-join automatically

2010-11-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935399#action_12935399
 ] 

Namit Jain commented on HIVE-1792:
--

Why don't we do the same in plan/ConditionalResolverCommonJoin? There we know 
what is going on.

Also, can we remove the unrelated changes from this patch, e.g. using a 
different DistributedCache API?

 track the joins which are being converted to map-join automatically
 ---

 Key: HIVE-1792
 URL: https://issues.apache.org/jira/browse/HIVE-1792
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1792-1.patch, hive-1792-2.patch, hive-1792-3.patch


 We should be able to track how many queries (join) got converted to
 map-join




[jira] Commented: (HIVE-1792) track the joins which are being converted to map-join automatically

2010-11-24 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12935450#action_12935450
 ] 

Namit Jain commented on HIVE-1792:
--

+1

running tests

 track the joins which are being converted to map-join automatically
 ---

 Key: HIVE-1792
 URL: https://issues.apache.org/jira/browse/HIVE-1792
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1792-1.patch, hive-1792-2.patch, hive-1792-3.patch, 
 hive-1792-4.patch


 We should be able to track how many queries (join) got converted to
 map-join




[jira] Resolved: (HIVE-1792) track the joins which are being converted to map-join automatically

2010-11-29 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1792.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed. Thanks Liyin

 track the joins which are being converted to map-join automatically
 ---

 Key: HIVE-1792
 URL: https://issues.apache.org/jira/browse/HIVE-1792
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1792-1.patch, hive-1792-2.patch, hive-1792-3.patch, 
 hive-1792-4.patch


 We should be able to track how many queries (join) got converted to
 map-join




[jira] Created: (HIVE-1813) Hive should be able to run on multiple data centers

2010-11-29 Thread Namit Jain (JIRA)
Hive should be able to run on multiple data centers
---

 Key: HIVE-1813
 URL: https://issues.apache.org/jira/browse/HIVE-1813
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
 Fix For: 0.7.0


Currently, hive assumes a single metastore and the HADOOP_HOME is passed as an 
environment variable. 

It would be desirable to support hive on top of multiple data centers (dfs + 
mr).

For example, there could be 2 metastores: primary and secondary. They would have 
different dfs's, and there will be a
dfs-mr mapping maintained by the metastore.

Hive would be enhanced to support multiple metastores and all operations (ddl + 
query) would span multiple metastores.

Different pluggable consistency policies can be employed. For example, if a 
table/partition is present in both the metastores with different
last modification times, either the latest one can be used or an error can be 
thrown.
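
The two policies mentioned above could sit behind a small interface. This is a sketch only: PartitionVersion, ConsistencyPolicy, LatestWins and FailOnConflict are hypothetical names, not an existing Hive API.

```java
// Hypothetical view of one table/partition as seen by one metastore.
class PartitionVersion {
  final String metastore;
  final long lastModified;

  PartitionVersion(String metastore, long lastModified) {
    this.metastore = metastore;
    this.lastModified = lastModified;
  }
}

// Hypothetical plug-in point: decides which copy to use when both metastores have one.
interface ConsistencyPolicy {
  PartitionVersion resolve(PartitionVersion a, PartitionVersion b);
}

// Policy 1: use the copy with the later modification time.
class LatestWins implements ConsistencyPolicy {
  public PartitionVersion resolve(PartitionVersion a, PartitionVersion b) {
    return a.lastModified >= b.lastModified ? a : b;
  }
}

// Policy 2: treat differing modification times as an error.
class FailOnConflict implements ConsistencyPolicy {
  public PartitionVersion resolve(PartitionVersion a, PartitionVersion b) {
    if (a.lastModified != b.lastModified) {
      throw new IllegalStateException("Conflicting copies in " + a.metastore
          + " and " + b.metastore);
    }
    return a;
  }
}
```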

It will be up to the application (outside hive) to copy the data from one 
metastore to another, and to maintain consistency inside.






[jira] Commented: (HIVE-1813) Hive should be able to run on multiple data centers

2010-11-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965126#action_12965126
 ] 

Namit Jain commented on HIVE-1813:
--

The data can be copied from one dfs to another using distcp - later on a 
wrapper can be developed in hive for the same.
Something like:

alter table T partition P copy src to dst;
alter table T partition P move src to dst;

 Hive should be able to run on multiple data centers
 ---

 Key: HIVE-1813
 URL: https://issues.apache.org/jira/browse/HIVE-1813
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
 Fix For: 0.7.0


 Currently, hive assumes a single metastore and the HADOOP_HOME is passed as an 
 environment variable. 
 It would be desirable to support hive on top of multiple data centers (dfs + 
 mr).
 For example, there could be 2 metastores: primary and secondary. They would have 
 different dfs's, and there will be a
 dfs-mr mapping maintained by the metastore.
 Hive would be enhanced to support multiple metastores and all operations (ddl 
 + query) would span multiple metastores.
 Different pluggable consistency policies can be employed. For example, if a 
 table/partition is present in both the metastores with different
 last modification times, either the latest one can be used or an error can be 
 thrown.
 It will be up to the application (outside hive) to copy the data from one 
 metastore to another, and to maintain consistency inside.




[jira] Assigned: (HIVE-1819) maintain lastAccessTime in the metastore

2010-11-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-1819:


Assignee: Namit Jain

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain






[jira] Created: (HIVE-1819) maintain lastAccessTime in the metastore

2010-11-30 Thread Namit Jain (JIRA)
maintain lastAccessTime in the metastore


 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain







[jira] Updated: (HIVE-1819) maintain lastAccessTime in the metastore

2010-11-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1819:
-

Attachment: hive.1819.1.patch

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1819.1.patch







[jira] Updated: (HIVE-1819) maintain lastAccessTime in the metastore

2010-11-30 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1819:
-

Status: Patch Available  (was: Open)

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1819.1.patch







[jira] Commented: (HIVE-1819) maintain lastAccessTime in the metastore

2010-12-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965736#action_12965736
 ] 

Namit Jain commented on HIVE-1819:
--

The reason I did not use it is that it is an int.

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1819.1.patch







[jira] Created: (HIVE-1822) Hive Conf variables should be relative to the dfs

2010-12-01 Thread Namit Jain (JIRA)
Hive Conf variables should be relative to the dfs
-

 Key: HIVE-1822
 URL: https://issues.apache.org/jira/browse/HIVE-1822
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain


Currently, the following parameter:
hive.metastore.warehouse.dir

refers to the complete path.

This becomes difficult to maintain if a mapping from Hive database to DFS is 
added.

This is needed for multi-data-center support in Hive.





[jira] Commented: (HIVE-1820) Make Hive database data center aware

2010-12-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965918#action_12965918
 ] 

Namit Jain commented on HIVE-1820:
--

Going forward, none of the other Hive configuration parameters should access 
the DFS directly.

 Make Hive database data center aware
 

 Key: HIVE-1820
 URL: https://issues.apache.org/jira/browse/HIVE-1820
 Project: Hive
  Issue Type: New Feature
Reporter: Ning Zhang
Assignee: Ning Zhang

 In order to support multiple data centers (different DFS, MR clusters) for 
 Hive, it is desirable to extend the Hive database to be data center aware. 
 Currently a Hive database is a logical concept and has no DFS or MR cluster 
 info associated with it. A database has the location property indicating the 
 default warehouse directory, but the user cannot specify or change it. In order 
 to make it data center aware, the following info needs to be maintained:
 1) data warehouse root location, which is the default HDFS location for newly 
 created tables (default=hive.metastore.warehouse.dir).
 2) scratch dir, which is the HDFS location where MR intermediate files are 
 created (default=hive.exec.scratch.dir)
 3) MR job tracker URI that jobs should be submitted to 
 (default=mapred.job.tracker)
 4) hadoop (bin) dir ($HADOOP_HOME/bin/hadoop)
 These parameters should be saved in database.parameters as (key, value) pairs, and 
 they override the jobconf parameters (so if the default database has no 
 parameter it will get it from hive-default.xml or hive-site.xml as it does 
 now). 
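
The override rule in the description above (database parameter first, jobconf value as the fallback) could look roughly like this. The class and method names are hypothetical; plain maps stand in for database.parameters and the jobconf.

```java
import java.util.Map;

/** Hypothetical lookup: database-level parameters take precedence over the jobconf. */
final class DatabaseParamSketch {

    /**
     * @param key      a parameter name such as hive.exec.scratch.dir
     * @param dbParams stand-in for database.parameters (may lack the key)
     * @param jobConf  stand-in for values loaded from hive-default.xml / hive-site.xml
     * @return the database-level value if present, otherwise the jobconf value
     */
    static String resolve(String key, Map<String, String> dbParams, Map<String, String> jobConf) {
        String fromDb = dbParams.get(key);
        return fromDb != null ? fromDb : jobConf.get(key);
    }
}
```

With this shape, a database without any parameters behaves exactly as it does today, which matches the fallback described for the default database.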




[jira] Updated: (HIVE-1819) maintain lastAccessTime in the metastore

2010-12-01 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1819:
-

Attachment: hive.1819.2.patch

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1819.1.patch, hive.1819.2.patch







[jira] Commented: (HIVE-1819) maintain lastAccessTime in the metastore

2010-12-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12965950#action_12965950
 ] 

Namit Jain commented on HIVE-1819:
--

added comments

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1819.1.patch, hive.1819.2.patch, hive.1819.3.patch







[jira] Commented: (HIVE-1517) ability to select across a database

2010-12-02 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12966243#action_12966243
 ] 

Namit Jain commented on HIVE-1517:
--

We would like to use it right away

 ability to select across a database
 ---

 Key: HIVE-1517
 URL: https://issues.apache.org/jira/browse/HIVE-1517
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Carl Steinbach
 Fix For: 0.7.0

 Attachments: HIVE-1517.1.patch.txt


 After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
 able to select across a database for this feature to be useful.
 For eg:
 use db1
 create table foo();
 use db2
 select .. from db1.foo.




[jira] Updated: (HIVE-1819) maintain lastAccessTime in the metastore

2010-12-02 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1819:
-

Attachment: hive.1819.4.patch

 maintain lastAccessTime in the metastore
 

 Key: HIVE-1819
 URL: https://issues.apache.org/jira/browse/HIVE-1819
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1819.1.patch, hive.1819.2.patch, hive.1819.3.patch, 
 hive.1819.4.patch







[jira] Commented: (HIVE-1826) StatsTask updates the table/partition object leaving a inconsistent version in hooks

2010-12-02 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12966357#action_12966357
 ] 

Namit Jain commented on HIVE-1826:
--

The inputs and outputs from the ReadEntity and WriteEntity are passed to the 
hooks.
However, the StatsTask may have updated these objects. 
Isn't it possible that the hooks (post execution) will see a stale version of 
this data ?
And, if these hooks update these objects and write them back to the metastore, 
the Stats changes will be lost.
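
The lost-update scenario can be reproduced in miniature. Everything below is a stand-in: FakeMetastore is a toy in-memory map, the parameter names and values are made up, and re-fetching inside the hook is shown only as one possible mitigation, not as the agreed fix.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy reproduction of a post-execution hook overwriting StatsTask's update. */
final class StaleHookSketch {

    /** In-memory stand-in for the metastore: table name -> parameters. */
    static final class FakeMetastore {
        private final Map<String, Map<String, String>> tables = new HashMap<>();

        /** Returns a copy, like a client-side object fetched over Thrift. */
        Map<String, String> getTable(String name) {
            Map<String, String> t = tables.get(name);
            return t == null ? new HashMap<>() : new HashMap<>(t);
        }

        void alterTable(String name, Map<String, String> params) {
            tables.put(name, new HashMap<>(params));
        }
    }

    /** Runs the scenario and returns the table parameters left in the metastore. */
    static Map<String, String> run(boolean refetchInHook) {
        FakeMetastore ms = new FakeMetastore();
        ms.alterTable("T", new HashMap<>());

        // Snapshot captured before execution, as carried by ReadEntity/WriteEntity.
        Map<String, String> snapshot = ms.getTable("T");

        // StatsTask updates the table in the metastore.
        Map<String, String> withStats = ms.getTable("T");
        withStats.put("numRows", "42");
        ms.alterTable("T", withStats);

        // Post-execution hook updates the object and writes it back.
        Map<String, String> hookView = refetchInHook ? ms.getTable("T") : snapshot;
        hookView.put("lastAccessTime", "12345");
        ms.alterTable("T", hookView);

        return ms.getTable("T");
    }
}
```

Calling `run(false)` leaves the table without `numRows`, which is the loss described above; `run(true)` keeps both values.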

 StatsTask updates the table/partition object leaving a inconsistent version 
 in hooks
 

 Key: HIVE-1826
 URL: https://issues.apache.org/jira/browse/HIVE-1826
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Ning Zhang






[jira] Created: (HIVE-1828) show locks should not use getTable()/getPartition

2010-12-03 Thread Namit Jain (JIRA)
show locks should not use getTable()/getPartition 
--

 Key: HIVE-1828
 URL: https://issues.apache.org/jira/browse/HIVE-1828
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang







[jira] Updated: (HIVE-1822) Hive Conf variables should be relative to the dfs

2010-12-03 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1822:
-

Status: Patch Available  (was: Open)

 Hive Conf variables should be relative to the dfs
 -

 Key: HIVE-1822
 URL: https://issues.apache.org/jira/browse/HIVE-1822
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1822.1.patch


 Currently, the following parameter:
 hive.metastore.warehouse.dir
 refers to the complete path.
 This becomes difficult to maintain if a mapping from Hive database to DFS is 
 added.
 This is needed for multi-data-center support in Hive.




[jira] Updated: (HIVE-1822) Hive Conf variables should be relative to the dfs

2010-12-03 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1822:
-

Attachment: hive.1822.1.patch

 Hive Conf variables should be relative to the dfs
 -

 Key: HIVE-1822
 URL: https://issues.apache.org/jira/browse/HIVE-1822
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1822.1.patch


 Currently, the following parameter:
 hive.metastore.warehouse.dir
 refers to the complete path.
 This becomes difficult to maintain if a mapping from Hive database to DFS is 
 added.
 This is needed for multi-data-center support in Hive.




[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-12-03 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1290#action_1290
 ] 

Namit Jain commented on HIVE-1648:
--

I don't see any new tests.

 Automatically gathering stats when reading a table/partition
 

 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Paul Butler
 Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, 
 HIVE-1648.patch


 HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
 gather stats. This requires an additional scan of the data. Stats gathering 
 can be piggy-backed on TableScanOperator whenever a table/partition is 
 scanned (given no LIMIT operator). 




[jira] Commented: (HIVE-1828) show locks should not use getTable()/getPartition

2010-12-04 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12966847#action_12966847
 ] 

Namit Jain commented on HIVE-1828:
--

One minor comment:

In case of 

show locks T extended;

Does anyone check that the table exists ?

The DDLTask can do that before calling zookeeper 

 show locks should not use getTable()/getPartition 
 --

 Key: HIVE-1828
 URL: https://issues.apache.org/jira/browse/HIVE-1828
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang
 Attachments: HIVE-1828.patch







[jira] Commented: (HIVE-1828) show locks should not use getTable()/getPartition

2010-12-05 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12966981#action_12966981
 ] 

Namit Jain commented on HIVE-1828:
--

Can you add the new patch?

Also, can you add a negative test (if you have not done so already)?

 show locks should not use getTable()/getPartition 
 --

 Key: HIVE-1828
 URL: https://issues.apache.org/jira/browse/HIVE-1828
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang
 Attachments: HIVE-1828.patch







[jira] Updated: (HIVE-1828) show locks should not use getTable()/getPartition

2010-12-05 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1828:
-

Status: Open  (was: Patch Available)

 show locks should not use getTable()/getPartition 
 --

 Key: HIVE-1828
 URL: https://issues.apache.org/jira/browse/HIVE-1828
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang
 Attachments: HIVE-1828.patch







[jira] Created: (HIVE-1830) mappers in group followed by joins may die OOM

2010-12-05 Thread Namit Jain (JIRA)
mappers in group followed by joins may die OOM
--

 Key: HIVE-1830
 URL: https://issues.apache.org/jira/browse/HIVE-1830
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Liyin Tang







[jira] Commented: (HIVE-1830) mappers in group followed by joins may die OOM

2010-12-05 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12967105#action_12967105
 ] 

Namit Jain commented on HIVE-1830:
--

After HIVE-1642, joins are automatically converted into map-joins at physical 
optimization time.

However, this may lead to problems.


For example, consider the query:

select T1.val, count(1) from T1 join T2 on T1.key=T2.key group by T1.val


This will have 2 map-reduce jobs, one for the join and the other for the group by.

Before HIVE-1642, the partial aggregation for the group by was performed in the 
reducer where the join is performed.
After HIVE-1642, however, it is performed in the mapper. The local 
task confirms that there is just
enough memory to hold the map-join data. However, it does not take into account 
the memory needed for the partial group
by.

So, when a group by follows a join as in the query above, it is a good idea to reduce the 
memory that the local task assumes is available when it validates
whether the small table fits. This can be controlled by a new 
configuration parameter, with some
default: say 70% of total memory (instead of 90%).

Also, the group by may still run out of memory, so it might be a good idea to 
check for free memory in the group by and
periodically flush memory.
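
A rough sketch of the two checks suggested above. The class and method names are hypothetical, the 0.7 and 0.9 fractions are just the figures mentioned in this comment, and the real decision would live in the local task and in GroupByOperator.

```java
/** Hypothetical memory checks for map-join followed by a partial group by. */
final class MapJoinMemorySketch {

    /** True if the small table fits within the allowed fraction of the heap. */
    static boolean smallTableFits(long estimatedBytes, long maxHeapBytes, double allowedFraction) {
        return estimatedBytes <= (long) (maxHeapBytes * allowedFraction);
    }

    /** True once heap usage crosses the threshold, i.e. the hash aggregation should flush. */
    static boolean shouldFlush(long usedBytes, long maxHeapBytes, double flushThreshold) {
        return (double) usedBytes / maxHeapBytes >= flushThreshold;
    }

    /** Current heap usage as seen by this JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }
}
```

A mapper could call `shouldFlush(usedHeapBytes(), Runtime.getRuntime().maxMemory(), threshold)` every N rows rather than on every row, to keep the check cheap.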

 mappers in group followed by joins may die OOM
 --

 Key: HIVE-1830
 URL: https://issues.apache.org/jira/browse/HIVE-1830
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Liyin Tang






[jira] Created: (HIVE-1831) Add a option to run task to check map-join possibility in non-local mode

2010-12-05 Thread Namit Jain (JIRA)
Add a option to run task to check map-join possibility in non-local mode


 Key: HIVE-1831
 URL: https://issues.apache.org/jira/browse/HIVE-1831
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang


In HIVE-1642, we run a local task to figure out if the small table can be held 
in memory, and then convert the join into a map-join.
However, this may not be a good idea for thin clients (which may not have enough 
memory).

This should be made configurable: the default can still be to run the 
task locally on the client machine, but an option
should be added for thin clients, where the task would be run as a map-only task.




[jira] Created: (HIVE-1834) more debugging for locking

2010-12-06 Thread Namit Jain (JIRA)
more debugging for locking
--

 Key: HIVE-1834
 URL: https://issues.apache.org/jira/browse/HIVE-1834
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain


Along with the time and the queryid, it might be a good idea to log if the lock 
was acquired explicitly (by a lock command)
or implicitly.




[jira] Commented: (HIVE-1823) upgrade the database thrift interface to allow parameters key-value pairs

2010-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968463#action_12968463
 ] 

Namit Jain commented on HIVE-1823:
--

+1

running tests

 upgrade the database thrift interface to allow parameters key-value pairs
 -

 Key: HIVE-1823
 URL: https://issues.apache.org/jira/browse/HIVE-1823
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1823.patch


 In order to store data-center-specific parameters in a Hive database, it is 
 desirable to extend the Hive database thrift interface with a parameters map 
 similar to Table and Partitions. 




[jira] Commented: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist

2010-12-06 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968612#action_12968612
 ] 

Namit Jain commented on HIVE-1763:
--

+1

The approach looks fine

 drop table (or view) should issue warning if table doesn't exist
 

 Key: HIVE-1763
 URL: https://issues.apache.org/jira/browse/HIVE-1763
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: dan f
Assignee: Paul Butler
Priority: Minor
 Attachments: HIVE-1763.patch


 drop table reports OK even if the table doesn't exist.  Better to report 
 something like mysql's Unknown table 'foo' so that, e.g., unwanted tables 
 (especially ones with names prone to typos) don't persist.




[jira] Resolved: (HIVE-1823) upgrade the database thrift interface to allow parameters key-value pairs

2010-12-07 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1823.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed. Thanks Ning

 upgrade the database thrift interface to allow parameters key-value pairs
 -

 Key: HIVE-1823
 URL: https://issues.apache.org/jira/browse/HIVE-1823
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1823.2.patch, HIVE-1823.patch


 In order to store data-center-specific parameters in a Hive database, it is 
 desirable to extend the Hive database thrift interface with a parameters map 
 similar to Table and Partitions. 




[jira] Updated: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist

2010-12-07 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1763:
-

Status: Open  (was: Patch Available)

 drop table (or view) should issue warning if table doesn't exist
 

 Key: HIVE-1763
 URL: https://issues.apache.org/jira/browse/HIVE-1763
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: dan f
Assignee: Paul Butler
Priority: Minor
 Attachments: HIVE-1763.patch


 drop table reports OK even if the table doesn't exist.  Better to report 
 something like mysql's Unknown table 'foo' so that, e.g., unwanted tables 
 (especially ones with names prone to typos) don't persist.




[jira] Commented: (HIVE-1763) drop table (or view) should issue warning if table doesn't exist

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968835#action_12968835
 ] 

Namit Jain commented on HIVE-1763:
--

However, it will need a lot of test result files to be updated.
Most of the tests will break

 drop table (or view) should issue warning if table doesn't exist
 

 Key: HIVE-1763
 URL: https://issues.apache.org/jira/browse/HIVE-1763
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: dan f
Assignee: Paul Butler
Priority: Minor
 Attachments: HIVE-1763.patch


 drop table reports OK even if the table doesn't exist.  Better to report 
 something like mysql's Unknown table 'foo' so that, e.g., unwanted tables 
 (especially ones with names prone to typos) don't persist.




[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968841#action_12968841
 ] 

Namit Jain commented on HIVE-1648:
--

@Yongqiang, you have missed the test changes in the patch - can you add them 
also ?

 Automatically gathering stats when reading a table/partition
 

 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Paul Butler
 Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, 
 HIVE-1648.patch, hive-1648.svn.patch


 HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
 gather stats. This requires an additional scan of the data. Stats gathering 
 can be piggy-backed on TableScanOperator whenever a table/partition is 
 scanned (given no LIMIT operator). 




[jira] Commented: (HIVE-1508) Add cleanup method to HiveHistory class

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968845#action_12968845
 ] 

Namit Jain commented on HIVE-1508:
--

+1

 Add cleanup method to HiveHistory class
 ---

 Key: HIVE-1508
 URL: https://issues.apache.org/jira/browse/HIVE-1508
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Anurag Phadke
Assignee: Edward Capriolo
Priority: Blocker
 Fix For: 0.7.0

 Attachments: hive-1508-1-patch.txt


 Running the Hive server for a long time (more than 90 minutes) results in too many open 
 file handles, eventually causing the server to crash as it runs out 
 of file handles.
 Actual bug as described by Carl Steinbach:
 the hive_job_log_* files are created by the HiveHistory class. This class 
 creates a PrintWriter for writing to the file, but never closes the writer. 
 It looks like we need to add a cleanup method to HiveHistory that closes the 
 PrintWriter and does any other necessary cleanup. 
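
The shape of such a cleanup method might look like the sketch below. It is not the HiveHistory code or the attached patch: JobLogSketch is a hypothetical stand-in that only shows an idempotent close of the PrintWriter.

```java
import java.io.Closeable;
import java.io.PrintWriter;
import java.io.Writer;

/** Hypothetical stand-in for a job log that owns a PrintWriter. */
final class JobLogSketch implements Closeable {

    private PrintWriter out;

    JobLogSketch(Writer target) {
        this.out = new PrintWriter(target);
    }

    void log(String line) {
        if (out == null) {
            throw new IllegalStateException("log already closed");
        }
        out.println(line);
    }

    /** Releases the underlying handle; safe to call more than once. */
    @Override
    public void close() {
        if (out != null) {
            out.close();
            out = null;
        }
    }

    boolean isClosed() {
        return out == null;
    }
}
```

Implementing Closeable lets callers use try-with-resources, so the handle is released even when a query fails part-way through.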




[jira] Resolved: (HIVE-1821) describe database command

2010-12-07 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1821.
--

Resolution: Duplicate

Duplicate of HIVE-1836

 describe database command
 -

 Key: HIVE-1821
 URL: https://issues.apache.org/jira/browse/HIVE-1821
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Ning Zhang

 A describe (extended) database command would be helpful if we introduce 
 parameters associated with databases. 




[jira] Commented: (HIVE-1821) describe database command

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12968939#action_12968939
 ] 

Namit Jain commented on HIVE-1821:
--

If you are doing this, do you want to add an 'alter database' also?

 describe database command
 -

 Key: HIVE-1821
 URL: https://issues.apache.org/jira/browse/HIVE-1821
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Ning Zhang

 A describe (extended) database command would be helpful if we introduce 
 parameters associated with databases. 




[jira] Commented: (HIVE-1836) Extend the CREATE DATABASE command with DBPROPERTIES

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969037#action_12969037
 ] 

Namit Jain commented on HIVE-1836:
--

+1

 Extend the CREATE DATABASE command with DBPROPERTIES
 

 Key: HIVE-1836
 URL: https://issues.apache.org/jira/browse/HIVE-1836
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1836.patch


 We should be able to assign key-value pairs of properties to Hive databases. 
 The proposed syntax is similar to the CREATE TABLE and CREATE INDEX commands:
 {code}
 CREATE DATABASE DB_NAME WITH DBPROPERTIES ('key1' = 'value1', 'key2' = 
 'value2');
 {code}
 The 
 {code}
 DESC DATABASE EXTENDED DB_NAME;
 {code}
 should be able to display the properties. (requires HIVE-1821)




[jira] Commented: (HIVE-1096) Hive Variables

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969089#action_12969089
 ] 

Namit Jain commented on HIVE-1096:
--

sure, that would be very useful

Let me know if you run into any issues 

 Hive Variables
 --

 Key: HIVE-1096
 URL: https://issues.apache.org/jira/browse/HIVE-1096
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Fix For: 0.7.0

 Attachments: 1096-9.diff, hive-1096-10-patch.txt, 
 hive-1096-11-patch.txt, hive-1096-12.patch.txt, hive-1096-15.patch.txt, 
 hive-1096-15.patch.txt, hive-1096-2.diff, hive-1096-20.patch.txt, 
 hive-1096-7.diff, hive-1096-8.diff, hive-1096.diff


 From mailing list:
 --Amazon Elastic MapReduce version of Hive seems to have a nice feature 
 called Variables. Basically you can define a variable via command-line 
 while invoking hive with -d DT=2009-12-09 and then refer to the variable via 
 ${DT} within the hive queries. This could be extremely useful. I can't seem 
 to find this feature even on trunk. Is this feature currently anywhere in the 
 roadmap?--
 This could be implemented in many places.
 A simple place to put this is 
 in Driver.compile or Driver.run: we can do string substitutions at that level, 
 and further downstream need not be affected. 
 There could be some benefits to doing this further downstream (parser, plan), 
 but based on the simple needs we may not need to overthink this.
 I will get started on implementing in compile unless someone wants to discuss 
 this more.
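
String substitution at the Driver level could be as small as the following sketch. The class name, the variable-name pattern, and the choice to leave unknown variables untouched are all assumptions made for illustration; the attached patches may do this differently.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical ${NAME} substitution applied to the query text before compilation. */
final class VariableSubstitutionSketch {

    // Assumed syntax: ${NAME}, where NAME is letters, digits and underscores.
    private static final Pattern VAR = Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    static String substitute(String query, Map<String, String> vars) {
        Matcher m = VAR.matcher(query);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            // Unknown variables are left as written rather than replaced with "".
            String replacement = value != null ? value : m.group(0);
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

With DT defined as 2009-12-09, as in the mailing list example, `select * from t where dt='${DT}'` becomes `select * from t where dt='2009-12-09'`.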




[jira] Commented: (HIVE-1837) optional timeout for hive clients

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12969093#action_12969093
 ] 

Namit Jain commented on HIVE-1837:
--

@Ashutosh, we cant wait for this feature till secure hadoop is available.
Once Hive is migrated to that, we can change the implementation of this feature.

@Yongqiang, can you add the new parameter definition in hive-default.xml ?
Also, can you make the thread sleep time (10 min.) configurable ?
Can you add a new test for the same - I mean, have a very small timeout and 
thread sleep time,
and a custom script which is sleeping indefinitely ? 
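
To make the test idea concrete, here is a rough sketch of a watchdog with a 
configurable timeout and sleep interval. It is only an assumption about how 
such a check could look, not the code in the patch; the class name, 
constructor parameters and callback are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical watchdog for illustration; not the implementation in the patch. */
final class ClientTimeoutWatchdog implements Runnable {
  private final long timeoutMillis;
  private final long sleepMillis;
  private final Runnable onTimeout;
  private final AtomicBoolean finished = new AtomicBoolean(false);

  ClientTimeoutWatchdog(long timeoutMillis, long sleepMillis, Runnable onTimeout) {
    this.timeoutMillis = timeoutMillis;
    this.sleepMillis = sleepMillis;
    this.onTimeout = onTimeout;
  }

  /** Called by the client when the query completes normally. */
  void markFinished() {
    finished.set(true);
  }

  @Override
  public void run() {
    long start = System.nanoTime();
    while (!finished.get()) {
      long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
      if (elapsedMillis >= timeoutMillis) {
        onTimeout.run(); // e.g. release locks and terminate the client
        return;
      }
      try {
        Thread.sleep(sleepMillis); // the configurable "thread sleep time"
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  public static void main(String[] args) {
    ClientTimeoutWatchdog watchdog =
        new ClientTimeoutWatchdog(50, 10, () -> System.out.println("timed out"));
    watchdog.run(); // run inline here; a real client would use a separate thread
  }
}
```

With both values small (tens of milliseconds), a test against a script that 
sleeps indefinitely can finish quickly.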



 optional timeout for hive clients
 -

 Key: HIVE-1837
 URL: https://issues.apache.org/jira/browse/HIVE-1837
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
Assignee: He Yongqiang
 Attachments: hive-1837.1.patch, hive-1837.2.patch


 It would be a good idea to have an optional timeout for hive clients.
 We encountered a query today, which seemed to have run by mistake, and it was 
 running for about a month.
 This was holding zookeeper locks, and making the whole debugging more complex 
 than it should be.
 It would be a good idea to have a timeout for a hive client.
 @Ning, I remember there was some issue with the Hive client having a timeout 
 of 1 day with HiPal.
 Do you remember the details ?




[jira] Commented: (HIVE-1830) mappers in group followed by joins may die OOM

2010-12-07 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969203#action_12969203
 ] 

Namit Jain commented on HIVE-1830:
--

  if (groupByOp.getConf() == null) {
    System.out.println("Group by desc is null");
    return null;
  }




This should never happen


GroupByOperator:
memoryThreshold = HiveConf.getFloatVar(hconf,
    HiveConf.ConfVars.HIVEMAPAGGRMEMORYTHRESHOLD);


This should also be in groupByDesc



 mappers in group followed by joins may die OOM
 --

 Key: HIVE-1830
 URL: https://issues.apache.org/jira/browse/HIVE-1830
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Liyin Tang
 Attachments: hive-1830-1.patch, hive-1830-2.patch, hive-1830-3.patch, 
 hive-1830-4.patch







[jira] Commented: (HIVE-1837) optional timeout for hive clients

2010-12-08 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969562#action_12969562
 ] 

Namit Jain commented on HIVE-1837:
--

OK, the changes look good.

+1

 optional timeout for hive clients
 -

 Key: HIVE-1837
 URL: https://issues.apache.org/jira/browse/HIVE-1837
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
Assignee: He Yongqiang
 Attachments: hive-1837.1.patch, hive-1837.2.patch


 It would be a good idea to have an optional timeout for hive clients.
 We encountered a query today, which seemed to have run by mistake, and it was 
 running for about a month.
 This was holding zookeeper locks, and making the whole debugging more complex 
 than it should be.
 It would be a good idea to have a timeout for a hive client.
 @Ning, I remember there was some issue with the Hive client having a timeout 
 of 1 day with HiPal.
 Do you remember the details ?




[jira] Commented: (HIVE-1842) Add the local flag to all the map red tasks, if the query is running locally.

2010-12-08 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969565#action_12969565
 ] 

Namit Jain commented on HIVE-1842:
--

+1

 Add the local flag to all the map red tasks, if the query is running locally.
 -

 Key: HIVE-1842
 URL: https://issues.apache.org/jira/browse/HIVE-1842
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.4.1
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hive-1842-1.patch







[jira] Commented: (HIVE-1844) Hanging hive client caused by TaskRunner's OutOfMemoryError

2010-12-08 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969614#action_12969614
 ] 

Namit Jain commented on HIVE-1844:
--

Great find, Yongqiang


+1

 Hanging hive client caused by TaskRunner's OutOfMemoryError
 ---

 Key: HIVE-1844
 URL: https://issues.apache.org/jira/browse/HIVE-1844
 Project: Hive
  Issue Type: Bug
Reporter: He Yongqiang
Assignee: He Yongqiang
 Attachments: hive-1844.1.patch







[jira] Updated: (HIVE-1842) Add the local flag to all the map red tasks, if the query is running locally.

2010-12-08 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1842:
-

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed. Thanks Liyin

 Add the local flag to all the map red tasks, if the query is running locally.
 -

 Key: HIVE-1842
 URL: https://issues.apache.org/jira/browse/HIVE-1842
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.4.1
Reporter: Liyin Tang
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1842-1.patch







[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift

2010-12-08 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969633#action_12969633
 ] 

Namit Jain commented on HIVE-1526:
--

@Ning, can you take care of this ?
So many other patches are waiting for this ?

 Hive should depend on a release version of Thrift
 -

 Key: HIVE-1526
 URL: https://issues.apache.org/jira/browse/HIVE-1526
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure, Clients
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.7.0

 Attachments: compile.err, HIVE-1526-complete.4.patch.txt, 
 HIVE-1526-complete.5.patch.txt, HIVE-1526-complete.6.patch.txt, 
 HIVE-1526-complete.7.patch.txt, HIVE-1526-complete.8.patch.txt, 
 HIVE-1526-no-codegen.3.patch.txt, HIVE-1526-no-codegen.4.patch.txt, 
 HIVE-1526-no-codegen.5.patch.txt, HIVE-1526-no-codegen.6.patch.txt, 
 HIVE-1526-no-codegen.7.patch.txt, HIVE-1526-no-codegen.8.patch.txt, 
 HIVE-1526.2.patch.txt, HIVE-1526.3.patch.txt, hive-1526.txt, libfb303.jar, 
 libthrift.jar, serde2_test.patch, svn_rm.sh, test.log, thrift-0.5.0.jar, 
 thrift-fb303-0.5.0.jar


 Hive should depend on a release version of Thrift, and ideally it should use 
 Ivy to resolve this dependency.
 The Thrift folks are working on adding Thrift artifacts to a maven repository 
 here: https://issues.apache.org/jira/browse/THRIFT-363




[jira] Resolved: (HIVE-1830) mappers in group followed by joins may die OOM

2010-12-08 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1830.
--

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]

Committed. Thanks Liyin

 mappers in group followed by joins may die OOM
 --

 Key: HIVE-1830
 URL: https://issues.apache.org/jira/browse/HIVE-1830
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0

 Attachments: hive-1830-1.patch, hive-1830-2.patch, hive-1830-3.patch, 
 hive-1830-4.patch, hive-1830-5.patch







[jira] Updated: (HIVE-1844) Hanging hive client caused by TaskRunner's OutOfMemoryError

2010-12-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1844:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Yongqiang

 Hanging hive client caused by TaskRunner's OutOfMemoryError
 ---

 Key: HIVE-1844
 URL: https://issues.apache.org/jira/browse/HIVE-1844
 Project: Hive
  Issue Type: Bug
Reporter: He Yongqiang
Assignee: He Yongqiang
 Attachments: hive-1844.1.patch







[jira] Commented: (HIVE-1843) add an option in dynamic partition inserts to throw an error if 0 partitions are created

2010-12-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969861#action_12969861
 ] 

Namit Jain commented on HIVE-1843:
--

+1


 add an option in dynamic partition inserts to throw an error if 0 partitions 
 are created
 

 Key: HIVE-1843
 URL: https://issues.apache.org/jira/browse/HIVE-1843
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Ning Zhang
 Attachments: HIVE-1843.patch


 Currently, we print an error message in that scenario.
 However, it would be very useful if an option was added where we would error 
 out.
 This would help a lot in debugging




[jira] Commented: (HIVE-1648) Automatically gathering stats when reading a table/partition

2010-12-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970009#action_12970009
 ] 

Namit Jain commented on HIVE-1648:
--

1. QBParseInfo: add setDestToLimit() for symmetry.
2. I am not sure any of your tests are working - set hive.stats.autogather = false
before you create the tables for which you want the stats to be populated while 
reading.

Clearly, this is the reason why piggyback_part.q is working.

2. piggyback_join.q

End:


show table extended like piggy_table3;
drop table piggy_table;


Add:

show table extended like piggy_table1;
show table extended like piggy_table2;


Also, add a test where you are joining:

piggyback_table1 a join
piggyback_table2 b on a.key = b.key join
piggyback_table3 c on b.key = c.key

and then run show table extended on all 3 tables.

3. piggyback_limit.q
add:

show table extended like piggy_table1;


before the end.
It should have no stats

4. piggyback_subq.q and _union.q are wrong - you need to create new tables,
and then run show table extended on them at the end, just like the other tests.

5.


 Automatically gathering stats when reading a table/partition
 

 Key: HIVE-1648
 URL: https://issues.apache.org/jira/browse/HIVE-1648
 Project: Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Paul Butler
 Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.4.patch, 
 HIVE-1648.5.patch, HIVE-1648.patch, hive-1648.svn.patch


 HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
 gather stats. This requires an additional scan of the data. Stats gathering 
 can be piggy-backed on TableScanOperator whenever a table/partition is 
 scanned (given no LIMIT operator). 




[jira] Updated: (HIVE-1843) add an option in dynamic partition inserts to throw an error if 0 partitions are created

2010-12-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1843:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Ning

 add an option in dynamic partition inserts to throw an error if 0 partitions 
 are created
 

 Key: HIVE-1843
 URL: https://issues.apache.org/jira/browse/HIVE-1843
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Ning Zhang
 Attachments: HIVE-1843.patch


 Currently, we print an error message in that scenario.
 However, it would be very useful if an option was added where we would error 
 out.
 This would help a lot in debugging




[jira] Commented: (HIVE-1694) Accelerate query execution using indexes

2010-12-10 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970245#action_12970245
 ] 

Namit Jain commented on HIVE-1694:
--

I think having a mechanism which lets us issue internal or recursive SQL is 
better in the long term.
That is something we will need anyway for future optimizations.

We can create a thin API around SemanticAnalyzer (analyze etc.), which is 
indirectly present in Driver.
Another implementation of that API can be the internal API, say RecursiveDriver.
In a recursive context, you are only allowed to invoke RecursiveDriver. 
External Clients (CliDriver, HiveServer etc.) invoke Driver directly.

As John said, definitely keep your optimizations pluggable. Currently, they are 
invoked as rule-based, 
but should be flexible enough to be invoked based on some costs in the future.
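
As a rough sketch of the shape I have in mind (every type below is 
hypothetical and only stands in for Driver, RecursiveDriver and a pluggable 
optimization; none of these names exist in Hive): an optimization is handed a 
runner and may issue internal queries only through it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical thin API; it only illustrates the proposed layering. */
interface QueryRunner {
  /** Compiles and runs one statement, returning 0 on success. */
  int run(String command);
}

/** Stand-in implementation that records what it was asked to run. */
final class RecordingRunner implements QueryRunner {
  private final String label;
  final List<String> executed = new ArrayList<>();

  RecordingRunner(String label) {
    this.label = label;
  }

  @Override
  public int run(String command) {
    executed.add(label + ": " + command);
    return 0;
  }
}

/** A pluggable optimization; it can only reach the runner it is given. */
interface Optimization {
  void apply(QueryRunner internalRunner);
}

final class RecursiveDriverSketch {
  public static void main(String[] args) {
    // External clients would hold the "external" runner; optimizations never see it.
    RecordingRunner recursive = new RecordingRunner("recursive");
    Optimization indexRewrite = runner -> runner.run("select key from idx_table");
    indexRewrite.apply(recursive);
    System.out.println(recursive.executed);
  }
}
```

Whether an optimization is chosen by a rule or by cost then becomes a 
decision made outside the Optimization interface.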

 Accelerate query execution using indexes
 

 Key: HIVE-1694
 URL: https://issues.apache.org/jira/browse/HIVE-1694
 Project: Hive
  Issue Type: New Feature
  Components: Indexing, Query Processor
Affects Versions: 0.7.0
Reporter: Nikhil Deshpande
Assignee: Nikhil Deshpande
 Attachments: demo_q1.hql, demo_q2.hql, HIVE-1694_2010-10-28.diff


 The index building patch (HIVE-417) is checked into trunk, this JIRA issue 
 tracks supporting indexes in Hive compiler & execution engine for SELECT 
 queries.
 This is in ref. to John's comment at
 https://issues.apache.org/jira/browse/HIVE-417?focusedCommentId=12884869&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12884869
 on creating separate JIRA issue for tracking index usage in optimizer & query 
 execution.
 The aim of this effort is to use indexes to accelerate query execution (for 
 certain class of queries). E.g.
 - Filters and range scans (already being worked on by He Yongqiang as part of 
 HIVE-417?)
 - Joins (index based joins)
 - Group By, Order By and other misc cases
 The proposal is multi-step:
 1. Building index based operators, compiler and execution engine changes
 2. Optimizer enhancements (e.g. cost-based optimizer to compare and choose 
 between index scans, full table scans etc.)
 This JIRA initially focuses on the first step. This JIRA is expected to hold 
 the information about index based plans & operator implementations for above 
 mentioned cases. 




[jira] Created: (HIVE-1847) option of continue on error

2010-12-10 Thread Namit Jain (JIRA)
option of continue on error
---

 Key: HIVE-1847
 URL: https://issues.apache.org/jira/browse/HIVE-1847
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain


In hive -f script, if any sql/command fails in that script, then hive exits 
with exit status -1, without continuing with the remaining hive commands.
Sometimes it is better to continue the script even after errors. 
For example, if a hive sql script contains many drop table commands, the 
script would exit when it could not find a table. But in this case, it is 
preferable to continue dropping the remaining tables.
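
A minimal sketch of the requested behaviour, assuming the script has already 
been split into commands and that a non-zero return code means failure. The 
class, method and flag names are hypothetical, and returning the last 
non-zero status at the end is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical script loop for illustration; not the CliDriver code. */
final class ScriptRunner {
  private ScriptRunner() {}

  /**
   * Runs the commands in order. Returns 0 if all succeed, otherwise the last
   * non-zero status seen. With continueOnError false, stops at the first failure.
   */
  static int runScript(List<String> commands, ToIntFunction<String> executor,
      boolean continueOnError) {
    int status = 0;
    for (String command : commands) {
      int ret = executor.applyAsInt(command);
      if (ret != 0) {
        status = ret;
        if (!continueOnError) {
          return status;
        }
      }
    }
    return status;
  }

  public static void main(String[] args) {
    List<String> script =
        Arrays.asList("drop table a", "drop table missing", "drop table b");
    // Fake executor: fails only for the table that does not exist.
    ToIntFunction<String> fake = c -> c.contains("missing") ? 1 : 0;
    System.out.println(runScript(script, fake, true));
  }
}
```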




[jira] Created: (HIVE-1848) bug in MAPJOIN

2010-12-13 Thread Namit Jain (JIRA)
bug in MAPJOIN
--

 Key: HIVE-1848
 URL: https://issues.apache.org/jira/browse/HIVE-1848
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain


explain
FROM srcpart c
JOIN srcpart d
ON ( c.key=d.key AND c.ds='2008-04-08' AND  d.ds='2008-04-08')
SELECT /*+ MAPJOIN(d) */ DISTINCT c.campaign_id;

The above query throws an error:


FAILED: Error in semantic analysis: line 0:-1 Invalid Function TOK_MAPJOIN





[jira] Created: (HIVE-1849) add more logging to partition pruning

2010-12-13 Thread Namit Jain (JIRA)
add more logging to partition pruning
-

 Key: HIVE-1849
 URL: https://issues.apache.org/jira/browse/HIVE-1849
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain


At Facebook, we are seeing some intermittent errors, where it seems that either 
not all the partitions are returned by the metastore
or some of them are pruned wrongly.

This patch adds more logging for debugging such scenarios.




[jira] Updated: (HIVE-1849) add more logging to partition pruning

2010-12-13 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1849:
-

Attachment: hive.1849.1.patch

 add more logging to partition pruning
 -

 Key: HIVE-1849
 URL: https://issues.apache.org/jira/browse/HIVE-1849
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1849.1.patch


 At Facebook, we are seeing some intermittent errors, where it seems that 
 either not all the partitions are returned by the metastore
 or some of them are pruned wrongly.
 This patch adds more logging for debugging such scenarios.




[jira] Updated: (HIVE-1695) MapJoin followed by ReduceSink should be done as single MapReduce Job

2010-12-13 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1695:
-

Status: Open  (was: Patch Available)

 MapJoin followed by ReduceSink should be done as single MapReduce Job
 -

 Key: HIVE-1695
 URL: https://issues.apache.org/jira/browse/HIVE-1695
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Amareshwari Sriramadasu
Assignee: Sreekanth Ramakrishnan
 Attachments: hive-1695-1.patch, hive-1695.patch


 Currently MapJoin followed by ReduceSink runs as two MapReduce jobs : One map 
 only job followed by a Map-Reduce job. It can be combined into single 
 MapReduce Job.




[jira] Updated: (HIVE-78) Authorization infrastructure for Hive

2010-12-13 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-78:
---

Status: Open  (was: Patch Available)

 Authorization infrastructure for Hive
 -

 Key: HIVE-78
 URL: https://issues.apache.org/jira/browse/HIVE-78
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor, Server Infrastructure
Reporter: Ashish Thusoo
Assignee: He Yongqiang
 Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, 
 hive-78-syntax-v1.patch, HIVE-78.1.nothrift.patch, HIVE-78.1.thrift.patch, 
 HIVE-78.2.nothrift.patch, HIVE-78.2.thrift.patch, HIVE-78.4.complete.patch, 
 HIVE-78.4.no_thrift.patch, HIVE-78.5.complete.patch, 
 HIVE-78.5.no_thrift.patch, HIVE-78.6.complete.patch, 
 HIVE-78.6.no_thrift.patch, hive-78.diff


 Allow hive to integrate with existing user repositories for authentication 
 and authorization information.




[jira] Updated: (HIVE-1848) bug in MAPJOIN

2010-12-14 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1848:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Yongqiang

 bug in MAPJOIN
 --

 Key: HIVE-1848
 URL: https://issues.apache.org/jira/browse/HIVE-1848
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: He Yongqiang
 Attachments: hive-1848.1.patch


 explain
 FROM srcpart c
 JOIN srcpart d
 ON ( c.key=d.key AND c.ds='2008-04-08' AND  d.ds='2008-04-08')
 SELECT /*+ MAPJOIN(d) */ DISTINCT c.campaign_id;
 The above query throws an error:
 FAILED: Error in semantic analysis: line 0:-1 Invalid Function TOK_MAPJOIN




[jira] Updated: (HIVE-1845) Some attributes in the Eclipse template file is deprecated

2010-12-14 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1845:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Liyin

 Some attributes in the Eclipse template file is deprecated  
 

 Key: HIVE-1845
 URL: https://issues.apache.org/jira/browse/HIVE-1845
 Project: Hive
  Issue Type: Bug
Reporter: Liyin Tang
Assignee: Liyin Tang
 Attachments: hive-1845-1.patch


 In the eclipse template file, it will reference this jar file, which is 
 deprecated.
 /@PROJECT@/build/metastore/hive-mod...@hive_version@.jar
 So the correct one should be:
 /@PROJECT@/build/metastore/hive-metasto...@hive_version@.jar
 Just update all the eclipse template files.




[jira] Commented: (HIVE-1849) add more logging to partition pruning

2010-12-14 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971425#action_12971425
 ] 

Namit Jain commented on HIVE-1849:
--

We need this log to confirm that

 add more logging to partition pruning
 -

 Key: HIVE-1849
 URL: https://issues.apache.org/jira/browse/HIVE-1849
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.1849.1.patch


 At Facebook, we are seeing some intermittent errors, where it seems that 
 either not all the partitions are returned by the metastore
 or some of them are pruned wrongly.
 This patch adds more logging for debugging such scenarios.




[jira] Created: (HIVE-1851) wrong number of rows inserted reported by Hive

2010-12-14 Thread Namit Jain (JIRA)
wrong number of rows inserted reported by Hive
--

 Key: HIVE-1851
 URL: https://issues.apache.org/jira/browse/HIVE-1851
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Ning Zhang


The counters that hive uses to report the number of rows inserted are not very 
reliable.
Until they become correct, it is a good idea to disable these reports.




[jira] Resolved: (HIVE-1851) wrong number of rows inserted reported by Hive

2010-12-14 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1851.
--

Resolution: Duplicate

Duplicate of https://issues.apache.org/jira/browse/HIVE-934


 wrong number of rows inserted reported by Hive
 --

 Key: HIVE-1851
 URL: https://issues.apache.org/jira/browse/HIVE-1851
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Ning Zhang

 The counters that hive uses to report the number of rows inserted are not 
 very reliable.
 Until they become correct, it is a good idea to disable these reports.




[jira] Commented: (HIVE-1806) The merge criteria on dynamic partitons should be per partiton

2010-12-15 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971870#action_12971870
 ] 

Namit Jain commented on HIVE-1806:
--

+1

 The merge criteria on dynamic partitons should be per partiton
 --

 Key: HIVE-1806
 URL: https://issues.apache.org/jira/browse/HIVE-1806
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1806.patch


 Currently the criterion for whether a merge job should be fired on dynamically 
 generated partitions is the average file size of files across all dynamic 
 partitions. It is very common that some dynamic partitions contain mostly 
 large files and some contain mostly small files. Even though the average 
 size of all the files is larger than hive.merge.smallfiles.avgsize, we 
 should merge only those partitions containing small files. 
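
 A small sketch of the per-partition criterion, for illustration only. The 
 class and method names are hypothetical, the file sizes are made up, and the 
 threshold parameter merely plays the role of hive.merge.smallfiles.avgsize; 
 this is not the code in the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not the logic in HIVE-1806.patch. */
final class MergeCandidates {
  private MergeCandidates() {}

  /** Returns the partitions whose own average file size is below the threshold. */
  static List<String> partitionsToMerge(Map<String, long[]> fileSizesByPartition,
      long avgSizeThreshold) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, long[]> entry : fileSizesByPartition.entrySet()) {
      long[] sizes = entry.getValue();
      if (sizes.length == 0) {
        continue; // nothing to merge in an empty partition
      }
      long total = 0;
      for (long size : sizes) {
        total += size;
      }
      if (total / sizes.length < avgSizeThreshold) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, long[]> sizes = new LinkedHashMap<>();
    sizes.put("ds=1", new long[] {1000, 900}); // mostly large files
    sizes.put("ds=2", new long[] {10, 20, 30}); // mostly small files
    // The average over all five files is 392, above the threshold of 100,
    // so a global check would skip the merge; per partition, ds=2 qualifies.
    System.out.println(partitionsToMerge(sizes, 100));
  }
}
```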




[jira] Commented: (HIVE-1806) The merge criteria on dynamic partitons should be per partiton

2010-12-15 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971977#action_12971977
 ] 

Namit Jain commented on HIVE-1806:
--

test dyn_part_empty.q failed - can you take a look ?

 The merge criteria on dynamic partitons should be per partiton
 --

 Key: HIVE-1806
 URL: https://issues.apache.org/jira/browse/HIVE-1806
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1806.patch


 Currently the criterion for whether a merge job should be fired on dynamically 
 generated partitions is the average file size of files across all dynamic 
 partitions. It is very common that some dynamic partitions contain mostly 
 large files and some contain mostly small files. Even though the average 
 size of all the files is larger than hive.merge.smallfiles.avgsize, we 
 should merge only those partitions containing small files. 




[jira] Updated: (HIVE-1806) The merge criteria on dynamic partitons should be per partiton

2010-12-15 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1806:
-

Status: Open  (was: Patch Available)

 The merge criteria on dynamic partitons should be per partiton
 --

 Key: HIVE-1806
 URL: https://issues.apache.org/jira/browse/HIVE-1806
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1806.patch


 Currently the criterion for whether a merge job should be fired on dynamically 
 generated partitions is the average file size of files across all dynamic 
 partitions. It is very common that some dynamic partitions contain mostly 
 large files and some contain mostly small files. Even though the average 
 size of all the files is larger than hive.merge.smallfiles.avgsize, we 
 should merge only those partitions containing small files. 




[jira] Created: (HIVE-1853) downgrade JDO version

2010-12-15 Thread Namit Jain (JIRA)
downgrade JDO version
-

 Key: HIVE-1853
 URL: https://issues.apache.org/jira/browse/HIVE-1853
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Paul Yang


After HIVE-1609, we are seeing some table not found errors intermittently.
We have a test case where 5 processes are concurrently issuing the same query 
- 
explain extended insert .. select from T

and once in a while, we get an error T not found - 
When we revert the JDO version, the error is gone.

We can investigate later to find the JDO bug, but for now this is a 
show-stopper for facebook, and needs
to be reverted immediately.

This also means that the filters will not be pushed to mysql.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1853) downgrade JDO version

2010-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972198#action_12972198
 ] 

Namit Jain commented on HIVE-1853:
--

+1


Running tests

 downgrade JDO version
 -

 Key: HIVE-1853
 URL: https://issues.apache.org/jira/browse/HIVE-1853
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Namit Jain
Assignee: Paul Yang
 Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch


 After HIVE-1609, we are seeing some table not found errors intermittently.
 We have a test case where 5 processes are concurrently issuing the same 
 query - 
 explain extended insert .. select from T
 and once in a while, we get an error T not found - 
 When we revert the JDO version, the error is gone.
 We can investigate later to find the JDO bug, but for now this is a 
 show-stopper for facebook, and needs
 to be reverted immediately.
 This also means that the filters will not be pushed to mysql.




[jira] Updated: (HIVE-1853) downgrade JDO version

2010-12-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1853:
-

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed. Thanks Paul

 downgrade JDO version
 -

 Key: HIVE-1853
 URL: https://issues.apache.org/jira/browse/HIVE-1853
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Namit Jain
Assignee: Paul Yang
 Fix For: 0.7.0

 Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch


 After HIVE-1609, we are seeing some table not found errors intermittently.
 We have a test case where 5 processes are concurrently issuing the same 
 query - 
 explain extended insert .. select from T
 and once in a while, we get an error T not found - 
 When we revert the JDO version, the error is gone.
 We can investigate later to find the JDO bug, but for now this is a 
 show-stopper for facebook, and needs
 to be reverted immediately.
 This also means that the filters will not be pushed to mysql.




[jira] Commented: (HIVE-1853) downgrade JDO version

2010-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12972276#action_12972276
 ] 

Namit Jain commented on HIVE-1853:
--

Also, is there another, more stable version of JDO that does not have this 
problem?

 downgrade JDO version
 -

 Key: HIVE-1853
 URL: https://issues.apache.org/jira/browse/HIVE-1853
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Namit Jain
Assignee: Paul Yang
 Fix For: 0.7.0

 Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch


 After HIVE-1609, we are seeing intermittent "table not found" errors.
 We have a test case where 5 processes concurrently issue the same 
 query - 
 explain extended insert .. select from T
 and once in a while, we get an error: T not found.
 When we revert the JDO version, the error is gone.
 We can investigate the JDO bug later, but for now this is a 
 show-stopper for Facebook, and needs
 to be reverted immediately.
 This also means that the filters will not be pushed to MySQL.




[jira] Commented: (HIVE-1853) downgrade JDO version

2010-12-16 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972280#action_12972280
 ] 

Namit Jain commented on HIVE-1853:
--

Ashutosh, what is your timeline?

Right now, we don't have the infrastructure in place to pick some patches and 
ignore others.
We pick up all the patches from open source into our internal tree.

For the time it will take us to develop this, can you live with the current 
trunk (lower JDO)?

 downgrade JDO version
 -

 Key: HIVE-1853
 URL: https://issues.apache.org/jira/browse/HIVE-1853
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Namit Jain
Assignee: Paul Yang
 Fix For: 0.7.0

 Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch


 After HIVE-1609, we are seeing intermittent "table not found" errors.
 We have a test case where 5 processes concurrently issue the same 
 query - 
 explain extended insert .. select from T
 and once in a while, we get an error: T not found.
 When we revert the JDO version, the error is gone.
 We can investigate the JDO bug later, but for now this is a 
 show-stopper for Facebook, and needs
 to be reverted immediately.
 This also means that the filters will not be pushed to MySQL.




[jira] Updated: (HIVE-1854) Temporarily disable metastore tests for listPartitionsByFilter()

2010-12-19 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1854:
-

Status: Open  (was: Patch Available)

 Temporarily disable metastore tests for listPartitionsByFilter()
 

 Key: HIVE-1854
 URL: https://issues.apache.org/jira/browse/HIVE-1854
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
Reporter: Paul Yang
Assignee: Paul Yang
Priority: Minor
 Attachments: HIVE-1854.1.patch


 After the JDO downgrade in HIVE-1853, the tests for the disabled function 
 listPartitionsByFilter() should be disabled as well.




[jira] Commented: (HIVE-1854) Temporarily disable metastore tests for listPartitionsByFilter()

2010-12-20 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973322#action_12973322
 ] 

Namit Jain commented on HIVE-1854:
--

+1



 Temporarily disable metastore tests for listPartitionsByFilter()
 

 Key: HIVE-1854
 URL: https://issues.apache.org/jira/browse/HIVE-1854
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
Reporter: Paul Yang
Assignee: Paul Yang
Priority: Minor
 Attachments: HIVE-1854.1.patch


 After the JDO downgrade in HIVE-1853, the tests for the disabled function 
 listPartitionsByFilter() should be disabled as well.




[jira] Resolved: (HIVE-1854) Temporarily disable metastore tests for listPartitionsByFilter()

2010-12-20 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-1854.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed. Thanks Paul

 Temporarily disable metastore tests for listPartitionsByFilter()
 

 Key: HIVE-1854
 URL: https://issues.apache.org/jira/browse/HIVE-1854
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
Reporter: Paul Yang
Assignee: Paul Yang
Priority: Minor
 Attachments: HIVE-1854.1.patch


 After the JDO downgrade in HIVE-1853, the tests for the disabled function 
 listPartitionsByFilter() should be disabled as well.




[jira] Commented: (HIVE-1855) Include Process ID in the log4j log file name

2010-12-20 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973383#action_12973383
 ] 

Namit Jain commented on HIVE-1855:
--

+1

 Include Process ID in the log4j log file name
 -

 Key: HIVE-1855
 URL: https://issues.apache.org/jira/browse/HIVE-1855
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1855.patch


 The Hive client always logs to /tmp/${user.name}/hive.log. If multiple CLI 
 instances run on the same host, logging may stop, or, if it does not, it is 
 difficult to tell their messages apart. Debugging would be easier if each 
 CLI instance wrote to its own log file. 
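
One way to derive a per-process log file name on the JVM is sketched below. The class PerProcessLogName, its methods, and the hive_<id>.log naming pattern are hypothetical and are not taken from the attached patch. The sketch relies on RuntimeMXBean.getName(), which commonly returns a string of the form pid@hostname, although that format is implementation-specific.

```java
import java.lang.management.ManagementFactory;

public class PerProcessLogName {

    /** Builds a per-process log path; the naming pattern is an assumption. */
    public static String logFileFor(String user, String processId) {
        return "/tmp/" + user + "/hive_" + processId + ".log";
    }

    /**
     * Returns the running JVM's name. On common JVMs this looks like
     * "pid@hostname", but the specification does not guarantee the format.
     */
    public static String currentProcessId() {
        return ManagementFactory.getRuntimeMXBean().getName();
    }

    public static void main(String[] args) {
        String user = System.getProperty("user.name");
        System.out.println(logFileFor(user, currentProcessId()));
    }
}
```

Two CLI instances started by the same user on the same host would then get different file names, as long as the JVM name differs per process.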




[jira] Updated: (HIVE-1855) Include Process ID in the log4j log file name

2010-12-20 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1855:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Ning

 Include Process ID in the log4j log file name
 -

 Key: HIVE-1855
 URL: https://issues.apache.org/jira/browse/HIVE-1855
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1855.patch


 The Hive client always logs to /tmp/${user.name}/hive.log. If multiple CLI 
 instances run on the same host, logging may stop, or, if it does not, it is 
 difficult to tell their messages apart. Debugging would be easier if each 
 CLI instance wrote to its own log file. 




[jira] Commented: (HIVE-1853) downgrade JDO version

2010-12-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973847#action_12973847
 ] 

Namit Jain commented on HIVE-1853:
--

Unfortunately, the query that I was running used some production tables.
I will try to reproduce the error with non-production tables.

 downgrade JDO version
 -

 Key: HIVE-1853
 URL: https://issues.apache.org/jira/browse/HIVE-1853
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Namit Jain
Assignee: Paul Yang
 Fix For: 0.7.0

 Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch


 After HIVE-1609, we are seeing intermittent "table not found" errors.
 We have a test case where 5 processes concurrently issue the same 
 query - 
 explain extended insert .. select from T
 and once in a while, we get an error: T not found.
 When we revert the JDO version, the error is gone.
 We can investigate the JDO bug later, but for now this is a 
 show-stopper for Facebook, and needs
 to be reverted immediately.
 This also means that the filters will not be pushed to MySQL.




[jira] Updated: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx

2010-12-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1818:
-

Status: Patch Available  (was: Open)

 Call frequency and duration metrics for HiveMetaStore via jmx
 -

 Key: HIVE-1818
 URL: https://issues.apache.org/jira/browse/HIVE-1818
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Sushanth Sowmyan
Priority: Minor
 Attachments: HIVE-1818.patch


 As recently brought up on the hive-dev mailing list, it would be useful if the 
 HiveMetaStore had some instrumentation capability to measure the frequency of 
 calls to the various HiveMetaStore functions and the time spent in those 
 calls. 
 There are already incrementCounter() and logStartFunction() / 
 logStartTableFunction(), etc. calls in HiveMetaStore, and they could be 
 refactored/repurposed to make calls that expose JMX MBeans as well. Or, a 
 Metrics subsystem could be introduced that makes the calls to 
 incrementCounter()/etc. as a refactor.
 It might also be possible to specify a -D parameter that the Metrics 
 subsystem could use to determine whether or not to be enabled and, if so, on 
 what port. Once we have the capability to instrument and expose 
 MBeans, other subsystems could also adopt and use 
 this system.
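
A minimal sketch of what such instrumentation might look like with the standard javax.management API follows. It is not the attached patch: the class MetastoreCallMetricsSketch, the CallStats bean, the sketch.metastore object-name domain, and the -Dsketch.metrics.enabled switch are all hypothetical names chosen for illustration. The port question is left out here; remote JMX access is normally configured through the JVM's own com.sun.management.jmxremote.* properties.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class MetastoreCallMetricsSketch {

    /** Management interface: the attributes visible through JMX. */
    public interface CallStatsMBean {
        long getCallCount();
        long getTotalTimeNanos();
    }

    /** Thread-safe call counter and accumulated duration for one function. */
    public static class CallStats implements CallStatsMBean {
        private final AtomicLong calls = new AtomicLong();
        private final AtomicLong nanos = new AtomicLong();

        public void record(long elapsedNanos) {
            calls.incrementAndGet();
            nanos.addAndGet(elapsedNanos);
        }

        @Override public long getCallCount() { return calls.get(); }
        @Override public long getTotalTimeNanos() { return nanos.get(); }
    }

    /** Hypothetical switch, set with -Dsketch.metrics.enabled=true. */
    public static boolean enabled() {
        return Boolean.getBoolean("sketch.metrics.enabled");
    }

    /** Creates stats for a function and, if enabled, registers them as an MBean. */
    public static CallStats register(String function) throws Exception {
        CallStats stats = new CallStats();
        if (enabled()) {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name =
                new ObjectName("sketch.metastore:type=CallStats,name=" + function);
            server.registerMBean(new StandardMBean(stats, CallStatsMBean.class), name);
        }
        return stats;
    }

    public static void main(String[] args) throws Exception {
        CallStats stats = register("get_table");
        long start = System.nanoTime();
        // ... the instrumented metastore call would run here ...
        stats.record(System.nanoTime() - start);
        System.out.println(stats.getCallCount() + " call(s), "
            + stats.getTotalTimeNanos() + " ns");
    }
}
```

Keeping the counters usable even when the switch is off means callers do not need to check whether metrics are enabled; only the MBean registration is conditional.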



