[jira] [Commented] (HIVE-1643) support range scans and non-key columns in HBase filter pushdown

2011-09-22 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13112791#comment-13112791
 ] 

Vaibhav Aggarwal commented on HIVE-1643:


I have been looking into this for the last 3 days.

Ideally, I would like to break this into two parts:

1. Add support for range queries on the primary key.
2. Add support for filter pushdown on non-primary-key columns.

I will try to submit a patch for part 1 soon.

 support range scans and non-key columns in HBase filter pushdown
 

 Key: HIVE-1643
 URL: https://issues.apache.org/jira/browse/HIVE-1643
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: Vaibhav Aggarwal

 HIVE-1226 added support for WHERE rowkey=3.  We would like to support WHERE 
 rowkey BETWEEN 10 and 20, as well as predicates on non-rowkeys (plus 
 conjunctions etc).  Non-rowkey conditions can't be used to filter out entire 
 ranges, but they can be used to push the per-row filter processing as far 
 down as possible.
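
The split described above can be sketched roughly as follows. This is an illustration only, not the HBase handler's implementation: the `Predicate` type and the `plan_pushdown` function are hypothetical names, and row keys are assumed to be integers for simplicity (real HBase row keys are byte arrays compared lexicographically). Rowkey conditions narrow the scan range, while every other condition is kept as a per-row filter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Predicate:
    """One conjunct of a WHERE clause: <column> <op> <value>. Hypothetical type."""
    column: str
    op: str      # one of "=", ">=", "<="
    value: int


def plan_pushdown(
    conjuncts: List[Predicate], key_column: str = "rowkey"
) -> Tuple[Optional[int], Optional[int], List[Predicate]]:
    """Split conjuncts into an inclusive rowkey range and residual row filters."""
    start: Optional[int] = None   # inclusive lower bound, None = unbounded
    stop: Optional[int] = None    # inclusive upper bound, None = unbounded
    residual: List[Predicate] = []

    for p in conjuncts:
        if p.column != key_column:
            # Non-rowkey conditions cannot prune ranges; keep them as row filters.
            residual.append(p)
        elif p.op == "=":
            start = p.value if start is None else max(start, p.value)
            stop = p.value if stop is None else min(stop, p.value)
        elif p.op == ">=":
            start = p.value if start is None else max(start, p.value)
        elif p.op == "<=":
            stop = p.value if stop is None else min(stop, p.value)
        else:
            # Unsupported operator on the key: fall back to per-row filtering.
            residual.append(p)
    return start, stop, residual


# WHERE rowkey BETWEEN 10 AND 20 AND col1 = 5
# (BETWEEN is written here as two conjuncts: rowkey >= 10 AND rowkey <= 20)
query = [
    Predicate("rowkey", ">=", 10),
    Predicate("rowkey", "<=", 20),
    Predicate("col1", "=", 5),
]
start, stop, residual = plan_pushdown(query)
```

Here `start` and `stop` would bound the scan, and `residual` holds the conditions that still have to be evaluated row by row, ideally as close to the storage layer as possible.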

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2020) Create a separate namespace for Hive variables

2011-09-01 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095701#comment-13095701
 ] 

Vaibhav Aggarwal commented on HIVE-2020:


Thanks for looking at this, Carl!

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Fix For: 0.8.0

 Attachments: HIVE-2020-2.patch, HIVE-2020-3.patch, HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'set hivevar:x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'
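
The referencing rules listed above can be sketched as follows. This is a minimal illustration, not Hive's VariableSubstitution code: the `substitute` function and the dictionary layout are hypothetical, and treating a bare `${var_name}` as a hivevar lookup is an assumption based on the ticket text. References that cannot be resolved are left untouched.

```python
import re
from typing import Dict

# Matches ${name} or ${namespace:name}.
_VAR = re.compile(r"\$\{(?:(\w+):)?([\w.]+)\}")


def substitute(statement: str, namespaces: Dict[str, Dict[str, str]]) -> str:
    """Replace ${ns:name} references; a bare ${name} is looked up in 'hivevar'."""

    def lookup(match):
        ns = match.group(1) or "hivevar"
        name = match.group(2)
        values = namespaces.get(ns, {})
        # Leave unknown references untouched rather than guessing a value.
        return values.get(name, match.group(0))

    return _VAR.sub(lookup, statement)


namespaces = {
    "hivevar": {"x": "y"},  # e.g. after: set hivevar:x=y
    "hiveconf": {"hive.exec.compress.output": "true"},
}
result = substitute(
    "SELECT * FROM t WHERE c = '${hivevar:x}' OR c = '${x}'", namespaces
)
```

Keeping one dictionary per namespace is what makes it possible to label each property by origin in a listing such as the output of 'set -v'.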





[jira] [Commented] (HIVE-2266) Fix compression parameters

2011-09-01 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095747#comment-13095747
 ] 

Vaibhav Aggarwal commented on HIVE-2266:


This patch attempts to fix a bug in the existing functionality in two ways:

1. In HiveFileFormatUtils.java, the wrong JobConf is being passed, as is clear 
from the context.

2. In other cases, the compression parameters are not being set.

The only difference this patch makes to the current behavior is smaller 
file sizes on the file system. I am not sure how to write a Hive query that can 
verify a difference in file sizes. Do you have any ideas that could help me add 
some quick tests for this? The current test executes through the code, checking 
that it does not result in any Exception or Error. It does not compare file 
sizes.
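
One possible way to test this without comparing file sizes, offered only as a sketch and not as what the patch or Hive's test suite does, is to check whether an output file starts with the codec's magic bytes. The example below assumes plain text output compressed with gzip, whose streams begin with the bytes 0x1f 0x8b; the helper name `looks_gzip_compressed` and the file names are hypothetical.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzip_compressed(path: str) -> bool:
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


# Stand-ins for query output files: one written compressed, one written plain.
tmpdir = tempfile.mkdtemp()
compressed_path = os.path.join(tmpdir, "000000_0.gz")
plain_path = os.path.join(tmpdir, "000000_0")

with gzip.open(compressed_path, "wb") as f:
    f.write(b"1\tfoo\n2\tbar\n")
with open(plain_path, "wb") as f:
    f.write(b"1\tfoo\n2\tbar\n")

compressed_ok = looks_gzip_compressed(compressed_path)
plain_ok = looks_gzip_compressed(plain_path)
```

A check of this kind does not depend on the compression ratio, so it should behave the same regardless of which native library produced the file.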


 Really? Which platforms are you talking about? Can you tell me how to 
 reproduce this interesting behavior?

Hadoop loads native compression libraries. I believe that they are platform 
dependent, so I do not assume that they always produce the same compression 
ratio. Please correct me if I am wrong here.

In any case, I think this is existing functionality in Hive that is broken and 
that we should fix.

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266-2.patch, HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Updated] (HIVE-2266) Fix compression parameters

2011-08-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2266:
---

Status: Patch Available  (was: Open)

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266-2.patch, HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Commented] (HIVE-2178) Log related Check style Comments fixes

2011-08-23 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089699#comment-13089699
 ] 

Vaibhav Aggarwal commented on HIVE-2178:


This is a good patch. I have noticed a number of cases where the root cause of 
the exception is missing.

 Log related Check style Comments fixes
 --

 Key: HIVE-2178
 URL: https://issues.apache.org/jira/browse/HIVE-2178
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2178.1.patch, HIVE-2178.patch


 Fix Log related Check style Comments





[jira] [Updated] (HIVE-2266) Fix compression parameters

2011-08-23 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2266:
---

Status: Patch Available  (was: Open)

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266-2.patch, HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Updated] (HIVE-2266) Fix compression parameters

2011-08-23 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2266:
---

Attachment: HIVE-2266-2.patch

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266-2.patch, HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Commented] (HIVE-1643) support range scans and non-key columns in HBase filter pushdown

2011-08-23 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089927#comment-13089927
 ] 

Vaibhav Aggarwal commented on HIVE-1643:


I would like to work on this if it is not being actively worked on right now.

 support range scans and non-key columns in HBase filter pushdown
 

 Key: HIVE-1643
 URL: https://issues.apache.org/jira/browse/HIVE-1643
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: John Sichi

 HIVE-1226 added support for WHERE rowkey=3.  We would like to support WHERE 
 rowkey BETWEEN 10 and 20, as well as predicates on non-rowkeys (plus 
 conjunctions etc).  Non-rowkey conditions can't be used to filter out entire 
 ranges, but they can be used to push the per-row filter processing as far 
 down as possible.





[jira] [Updated] (HIVE-2020) Create a separate namespace for Hive variables

2011-08-19 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2020:
---

Attachment: HIVE-2020-3.patch

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020-2.patch, HIVE-2020-3.patch, HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Updated] (HIVE-2020) Create a separate namespace for Hive variables

2011-08-19 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2020:
---

Status: Patch Available  (was: Open)

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020-2.patch, HIVE-2020-3.patch, HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Updated] (HIVE-2020) Create a separate namespace for Hive variables

2011-08-19 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2020:
---

Description: 
Support for variable substitution was added in HIVE-1096. However, variable 
substitution was implemented by reusing the HiveConf namespace, so there is no 
separation between Hive configuration properties and Hive variables.

This ticket encompasses the following enhancements:
* Create a separate namespace for managing Hive variables.
* Add support for setting variables on the command line via '-hivevar x=y'
* Add support for setting variables through the CLI via 'set hivevar:x=y'
* Add support for referencing variables in statements using either 
'${hivevar:var_name}' or '${var_name}'
* Provide a means for differentiating between hiveconf, hivevar, system, and 
environment properties in the output of 'set -v'



  was:
Support for variable substitution was added in HIVE-1096. However, variable 
substitution was implemented by reusing the HiveConf namespace, so there is no 
separation between Hive configuration properties and Hive variables.

This ticket encompasses the following enhancements:
* Create a separate namespace for managing Hive variables.
* Add support for setting variables on the command line via '-hivevar x=y'
* Add support for setting variables through the CLI via 'var x=y'
* Add support for referencing variables in statements using either 
'${hivevar:var_name}' or '${var_name}'
* Provide a means for differentiating between hiveconf, hivevar, system, and 
environment properties in the output of 'set -v'




 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020-2.patch, HIVE-2020-3.patch, HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'set hivevar:x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Created] (HIVE-2357) Support connection timeout in hive JDBC

2011-08-08 Thread Vaibhav Aggarwal (JIRA)
Support connection timeout in hive JDBC
---

 Key: HIVE-2357
 URL: https://issues.apache.org/jira/browse/HIVE-2357
 Project: Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal








[jira] [Updated] (HIVE-2020) Create a separate namespace for Hive variables

2011-08-08 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2020:
---

Attachment: HIVE-2020-2.patch

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020-2.patch, HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Commented] (HIVE-2020) Create a separate namespace for Hive variables

2011-08-08 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081267#comment-13081267
 ] 

Vaibhav Aggarwal commented on HIVE-2020:


New review request: https://reviews.apache.org/r/1324/

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020-2.patch, HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Commented] (HIVE-2020) Create a separate namespace for Hive variables

2011-08-05 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080275#comment-13080275
 ] 

Vaibhav Aggarwal commented on HIVE-2020:


I had a chat with Carl about this issue.
The following are the planned next steps:

1. Use VariableSubstitution instead of DefaultPreprocessor.
2. Add support for specifying variables as '${var_name}' only for now. (Already 
implemented)
3. Support set -v to clearly separate hive variables from hiveconf variables.
4. Support setting variables through command line as '-d x=y' OR '--define x=y' 
(Already implemented)

Thanks
Vaibhav

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Commented] (HIVE-2318) Support multiple file systems

2011-08-04 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079604#comment-13079604
 ] 

Vaibhav Aggarwal commented on HIVE-2318:


@Carl
You will notice that 70% of the code deals with:

1. Supporting reading from one file system and writing to another in the same 
query.
2. Writing directly to the result directory if the file system does not support 
move.

S3FileSystem serves as a specific example in this case, which is why I 
chose this title.

 Support multiple file systems
 -

 Key: HIVE-2318
 URL: https://issues.apache.org/jira/browse/HIVE-2318
 Project: Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2318.patch


 Currently some of the Hive tasks like MoveTask, ConditionalMergeResolver 
 assume that the data is being copied or moved on the same file system.
 These operators fail if the source table is in one file system (like HDFS) and 
 the destination table is in another file system (like s3).
 This patch aims to:
 1. Support moving data between different file systems.
 2. Add support for file systems which do not support 'move' operation like s3.
 3. Remove redundant operations like moving data from and to the same location.
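
The three goals above amount to a small decision rule, sketched below. This is an illustration under stated assumptions, not the logic of the attached patch: `same_filesystem` and `plan_move` are hypothetical names, two URIs are assumed to be on the same file system when their scheme and authority match, and the "copy" result simply stands in for whatever the patch does when a rename is not possible.

```python
from urllib.parse import urlparse


def same_filesystem(src: str, dst: str) -> bool:
    """Two URIs are on the same file system if scheme and authority match."""
    s, d = urlparse(src), urlparse(dst)
    return (s.scheme, s.netloc) == (d.scheme, d.netloc)


def plan_move(src: str, dst: str, supports_move: bool = True) -> str:
    """Pick an action for moving src to dst. Returns 'noop', 'rename', or 'copy'."""
    if src.rstrip("/") == dst.rstrip("/"):
        return "noop"        # moving data from and to the same location
    if same_filesystem(src, dst) and supports_move:
        return "rename"      # cheap move within one file system
    return "copy"            # cross-file-system, or no 'move' support


within_hdfs = plan_move("hdfs://nn:8020/tmp/out", "hdfs://nn:8020/warehouse/t")
hdfs_to_s3 = plan_move("hdfs://nn:8020/tmp/out", "s3://bucket/warehouse/t")
s3_no_move = plan_move("s3://bucket/a", "s3://bucket/b", supports_move=False)
redundant = plan_move("s3://bucket/a/", "s3://bucket/a")
```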





[jira] [Commented] (HIVE-2318) Support multiple file systems

2011-08-04 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079606#comment-13079606
 ] 

Vaibhav Aggarwal commented on HIVE-2318:


I am thinking of writing some unit tests that exercise individual methods in 
order to simplify testing.
What do you think?

 Support multiple file systems
 -

 Key: HIVE-2318
 URL: https://issues.apache.org/jira/browse/HIVE-2318
 Project: Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2318.patch


 Currently some of the Hive tasks like MoveTask, ConditionalMergeResolver 
 assume that the data is being copied or moved on the same file system.
 These operators fail if the source table is in one file system (like HDFS) and 
 the destination table is in another file system (like s3).
 This patch aims to:
 1. Support moving data between different file systems.
 2. Add support for file systems which do not support 'move' operation like s3.
 3. Remove redundant operations like moving data from and to the same location.





[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-03 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Status: Patch Available  (was: Open)

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298-3.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.
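
The kind of guard described above can be sketched as follows. This is not the UDAFPercentile code or the attached patch: the function names are hypothetical, and the linear-interpolation rule is an assumption chosen only to make the example complete. The point is the early return for a null percentile list.

```python
import math
from typing import List, Optional, Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Percentile of non-empty `values` with linear interpolation, 0 <= p <= 1."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def percentiles(
    values: Sequence[float], ps: Optional[Sequence[float]]
) -> Optional[List[float]]:
    """Return one result per requested percentile, or None for a null request."""
    if ps is None or not values:
        # Tolerate a null percentile list instead of raising an error.
        return None
    return [percentile(values, p) for p in ps]


null_request = percentiles([1, 2, 3, 4], None)
median = percentiles([1, 2, 3, 4], [0.5])
```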





[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-03 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Attachment: HIVE-2298-3.patch

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298-3.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.





[jira] [Commented] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-03 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079128#comment-13079128
 ] 

Vaibhav Aggarwal commented on HIVE-2298:


Hi Amreshwari

Thanks for reporting the diff.
I generated a new patch.
Hopefully this fixes the issue.

Vaibhav

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298-3.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.





[jira] [Commented] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-02 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078490#comment-13078490
 ] 

Vaibhav Aggarwal commented on HIVE-2298:


Amreshwari, could you please post the diff or the cause of the failure?
The test succeeded on my desktop.

I ran the following command:

ant test -Dtestcase=TestCliDriver -Dqfile=udf_percentile.q

Is there a different command I should be using to run just this particular test?

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.





[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-01 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Status: Patch Available  (was: Open)

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.





[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-01 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Attachment: HIVE-2298-2.patch

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.





[jira] [Commented] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-08-01 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13075989#comment-13075989
 ] 

Vaibhav Aggarwal commented on HIVE-2298:


Also added a test case.

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298-2.patch, HIVE-2298.patch


 UDAFPercentile when passed null percentile list will throw a null pointer 
 exception.
 Submitting a small fix for that.





[jira] [Updated] (HIVE-2266) Fix compression parameters

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2266:
---

Status: Patch Available  (was: Open)

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Updated] (HIVE-2266) Fix compression parameters

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2266:
---

Attachment: HIVE-2266.patch

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Updated] (HIVE-2020) Create a separate namespace for Hive variables

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2020:
---

Attachment: HIVE-2020.patch

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'
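
As a rough illustration of the namespace separation listed above, the following Python sketch resolves `${hivevar:name}`, `${hiveconf:name}`, `${system:name}`, `${env:name}`, and bare `${name}` references. The function name `substitute`, the dictionary layout, and the choice to leave unknown references untouched are assumptions made for this example; it is not a description of the actual patch.

```python
import re

# Matches ${name} or ${namespace:name}; the namespace prefix is optional.
_REF = re.compile(r"\$\{(?:(hiveconf|hivevar|system|env):)?([A-Za-z0-9_.]+)\}")


def substitute(statement, namespaces):
    """Replace variable references in `statement`.

    `namespaces` maps a namespace name to a dict of its variables.
    Unknown references are left untouched (an assumption of this sketch).
    """
    def resolve(match):
        namespace = match.group(1) or "hivevar"  # bare ${name} is read as a hivevar
        name = match.group(2)
        values = namespaces.get(namespace, {})
        return values.get(name, match.group(0))

    return _REF.sub(resolve, statement)


namespaces = {
    "hivevar": {"tbl": "page_views"},
    "hiveconf": {"tbl": "not_used_here"},
}
query = substitute("SELECT * FROM ${hivevar:tbl} WHERE dt = '${dt}'", namespaces)
```

Keeping one dictionary per namespace means a hivevar and a hiveconf property can share a name without colliding, which is the separation the ticket asks for.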





[jira] [Updated] (HIVE-2318) Support multiple file systems

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2318:
---

Status: Patch Available  (was: Open)

 Support multiple file systems
 -

 Key: HIVE-2318
 URL: https://issues.apache.org/jira/browse/HIVE-2318
 Project: Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2318.patch


 Currently some Hive tasks, such as MoveTask and ConditionalMergeResolver, 
 assume that data is being copied or moved within a single file system.
 These tasks fail if the source table is in one file system (like HDFS) and 
 the destination table is in another (like S3).
 This patch aims to:
 1. Support moving data between different file systems.
 2. Add support for file systems, like S3, that do not support a 'move' operation.
 3. Remove redundant operations, such as moving data from and to the same location.
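
The three goals above can be pictured as a small decision function. This is a minimal Python sketch, not the patch itself: the name `plan_move`, the action labels, and the set of URI schemes assumed to support a native rename are all hypothetical.

```python
from urllib.parse import urlparse


def plan_move(src, dst, supports_rename=("hdfs", "file")):
    """Return the action a move from `src` to `dst` would need.

    `supports_rename` lists URI schemes assumed to have a native rename.
    """
    s, d = urlparse(src), urlparse(dst)
    if (s.scheme, s.netloc, s.path.rstrip("/")) == (d.scheme, d.netloc, d.path.rstrip("/")):
        return "skip"             # source and destination are the same location
    same_fs = (s.scheme, s.netloc) == (d.scheme, d.netloc)
    if same_fs and s.scheme in supports_rename:
        return "rename"           # cheap move within one file system
    return "copy_then_delete"     # different file systems, or no native rename
```

For example, `plan_move("hdfs://nn/a", "s3://bucket/a")` falls through to the copy path because the two URIs name different file systems.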





[jira] [Commented] (HIVE-2266) Fix compression parameters

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13072549#comment-13072549
 ] 

Vaibhav Aggarwal commented on HIVE-2266:


Carl, I have attached the patch.

 Fix compression parameters
 --

 Key: HIVE-2266
 URL: https://issues.apache.org/jira/browse/HIVE-2266
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2266.patch


 There are a number of places where compression values are not set correctly 
 in FileSinkOperator. This results in uncompressed files.





[jira] [Updated] (HIVE-2318) Support multiple file systems

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2318:
---

Attachment: HIVE-2318.patch

 Support multiple file systems
 -

 Key: HIVE-2318
 URL: https://issues.apache.org/jira/browse/HIVE-2318
 Project: Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2318.patch


 Currently some Hive tasks, such as MoveTask and ConditionalMergeResolver, 
 assume that data is being copied or moved within a single file system.
 These tasks fail if the source table is in one file system (like HDFS) and 
 the destination table is in another (like S3).
 This patch aims to:
 1. Support moving data between different file systems.
 2. Add support for file systems, like S3, that do not support a 'move' operation.
 3. Remove redundant operations, such as moving data from and to the same location.





[jira] [Updated] (HIVE-2020) Create a separate namespace for Hive variables

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2020:
---

Status: Patch Available  (was: Open)

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2020.patch


 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-28 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Attachment: HIVE-2259-2.patch

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2259-2.patch, HIVE-2298.patch


 UDAFPercentile throws a NullPointerException when it is passed a null 
 percentile list.
 Submitting a small fix for that.
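
The kind of guard described above can be sketched in Python as follows. The function `percentiles`, its linear interpolation, and the decision to return `None` for a null list are assumptions made for illustration; the ticket does not say what the patched UDAF returns.

```python
def percentiles(sorted_values, fractions):
    """Return one interpolated value per requested fraction.

    Returns None when no fractions are given or there is no data,
    instead of failing on the missing list.
    """
    if fractions is None or not sorted_values:
        return None               # guard: a null list yields a null result
    last = len(sorted_values) - 1
    results = []
    for f in fractions:
        pos = f * last            # fractional position in the sorted data
        lo = int(pos)
        hi = min(lo + 1, last)
        weight = pos - lo
        results.append(sorted_values[lo] * (1 - weight) + sorted_values[hi] * weight)
    return results
```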





[jira] [Assigned] (HIVE-2020) Create a separate namespace for Hive variables

2011-07-27 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal reassigned HIVE-2020:
--

Assignee: Vaibhav Aggarwal

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach
Assignee: Vaibhav Aggarwal

 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Resolved] (HIVE-2195) Hive queries hangs with first stage job created with zero mappers and 1 reducer,

2011-07-27 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal resolved HIVE-2195.


Resolution: Not A Problem

 Hive queries hangs with first stage job created with zero mappers and 1 
 reducer,  
 --

 Key: HIVE-2195
 URL: https://issues.apache.org/jira/browse/HIVE-2195
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: vitthal (Suhas) Gogate
Assignee: Vaibhav Aggarwal

 This happens when a query aggregates data with a predicate that selects only 
 non-existent data partitions. 
 E.g., table XXX has five columns: A (int), B (int), C (string), date (int), and 
 hour (int), where date and hour are the partition columns. 
 select cast((100*date+hour) as BIGINT) as datehour, sum(A) as sumA,
 sum(B) as sumB from XXX where date=20110925 and C='test' group by date, hour 
 order by datehour  
 Note that the selected date partition does not exist in the Hive table, 
 i.e. there are no partitions for date=20110925.
 The query hangs in the first MapReduce job it creates, which has zero 
 mappers and 1 reducer.





[jira] [Created] (HIVE-2318) Support multiple file systems

2011-07-27 Thread Vaibhav Aggarwal (JIRA)
Support multiple file systems
-

 Key: HIVE-2318
 URL: https://issues.apache.org/jira/browse/HIVE-2318
 Project: Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal


Currently some Hive tasks, such as MoveTask and ConditionalMergeResolver, assume 
that data is being copied or moved within a single file system.

These tasks fail if the source table is in one file system (like HDFS) and 
the destination table is in another (like S3).

This patch aims to:

1. Support moving data between different file systems.
2. Add support for file systems, like S3, that do not support a 'move' operation.
3. Remove redundant operations, such as moving data from and to the same location.





[jira] [Commented] (HIVE-2020) Create a separate namespace for Hive variables

2011-07-26 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13071318#comment-13071318
 ] 

Vaibhav Aggarwal commented on HIVE-2020:


I propose using -d and --define to define Hive variables.
Amazon Elastic MapReduce already uses this notation for Hive variables and 
variable substitution.

This approach would also clearly separate -hiveconf from -d or --define, 
which would be used purely to set Hive variables.

It would also maintain consistency for Hive users.

 Create a separate namespace for Hive variables
 --

 Key: HIVE-2020
 URL: https://issues.apache.org/jira/browse/HIVE-2020
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Carl Steinbach

 Support for variable substitution was added in HIVE-1096. However, variable 
 substitution was implemented by reusing the HiveConf namespace, so there is 
 no separation between Hive configuration properties and Hive variables.
 This ticket encompasses the following enhancements:
 * Create a separate namespace for managing Hive variables.
 * Add support for setting variables on the command line via '-hivevar x=y'
 * Add support for setting variables through the CLI via 'var x=y'
 * Add support for referencing variables in statements using either 
 '${hivevar:var_name}' or '${var_name}'
 * Provide a means for differentiating between hiveconf, hivevar, system, and 
 environment properties in the output of 'set -v'





[jira] [Commented] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-25 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070662#comment-13070662
 ] 

Vaibhav Aggarwal commented on HIVE-2297:


I generated the patch from the 0.7 branch and then rebased it onto 0.8.
While generating the patch, I didn't realize that this was already fixed in 0.8.
I will resolve this.

Thanks
Vaibhav

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2297.patch, fix_npe.patch








[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-25 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Resolution: Not A Problem
Status: Resolved  (was: Patch Available)

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2297.patch, fix_npe.patch








[jira] [Commented] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-25 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070800#comment-13070800
 ] 

Vaibhav Aggarwal commented on HIVE-2299:


Thanks for looking at this improvement request Carl!

 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.8.0

 Attachments: HIVE-2299.patch


 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to an O(n) operation.





[jira] [Commented] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-25 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070814#comment-13070814
 ] 

Vaibhav Aggarwal commented on HIVE-2298:


I will make the style changes and try to add a test case that covers this 
specific case.

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.7.0
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch


 UDAFPercentile throws a NullPointerException when it is passed a null 
 percentile list.
 Submitting a small fix for that.





[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Attachment: fix_npe.patch

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch








[jira] [Commented] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069134#comment-13069134
 ] 

Vaibhav Aggarwal commented on HIVE-2297:


Some file systems can return null when there are no objects to list.
I added a fix for that.
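
A null-safe listing of the kind described in this comment might look like the Python sketch below. The helper names `list_or_empty` and `count_entries` are hypothetical, and `list_fn` stands in for whatever listing call the file system provides; the actual change in fix_npe.patch is not shown in this thread.

```python
def list_or_empty(list_fn, path):
    """Call `list_fn(path)` and normalise a None result to an empty list."""
    entries = list_fn(path)
    return [] if entries is None else list(entries)


def count_entries(list_fn, path):
    # Iterating the normalised result is safe even when nothing was listed.
    return sum(1 for _ in list_or_empty(list_fn, path))
```

Normalising at the call site keeps the rest of the code free of repeated null checks.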

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch








[jira] [Created] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)
Fix NPE in ConditionalResolverSkewJoin
--

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch







[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Status: Patch Available  (was: Open)

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch








[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Attachment: HIVE-2298.patch

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch


 UDAFPercentile throws a NullPointerException when it is passed a null 
 percentile list.
 Submitting a small fix for that.





[jira] [Created] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-21 Thread Vaibhav Aggarwal (JIRA)
Fix UDAFPercentile to tolerate null percentiles
---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch

UDAFPercentile throws a NullPointerException when it is passed a null 
percentile list.
Submitting a small fix for that.





[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Status: Patch Available  (was: Open)

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch


 UDAFPercentile throws a NullPointerException when it is passed a null 
 percentile list.
 Submitting a small fix for that.





[jira] [Created] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)
Optimize Hive query startup time for multiple partitions


 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal


Added an optimization to the way input splits are computed.
Reduced an O(n^2) operation to O(n) operation.
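
The ticket does not say which loop was optimized, so the following Python sketch only illustrates the general shape of such a change: replacing a per-item linear scan with a one-time index. The function names and the `dir`/`name` record layout are hypothetical.

```python
def match_quadratic(paths, partitions):
    """For each path, scan every partition: O(n^2) for n paths and n partitions."""
    return [next(p for p in partitions if p["dir"] == path)["name"] for path in paths]


def match_linear(paths, partitions):
    """Build a directory index once, then do one dict lookup per path: O(n)."""
    by_dir = {p["dir"]: p["name"] for p in partitions}
    return [by_dir[path] for path in paths]
```

Both functions return the same result; only the amount of work per path differs, which matters when a query touches many partitions.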





[jira] [Updated] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2299:
---

Description: 
Added an optimization to the way input splits are computed.
Reduced an O(n^2) operation to O n operation.

  was:
Added an optimization to the way input splits are computed.
Reduced an O(n^2) operation to O(n) operation.


 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal

 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to O n operation.





[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Attachment: HIVE-2297.patch

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2297.patch, fix_npe.patch








[jira] [Updated] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2299:
---

Attachment: HIVE-2299.patch

 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal
 Attachments: HIVE-2299.patch


 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to an O(n) operation.





[jira] [Updated] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2299:
---

Assignee: Vaibhav Aggarwal
  Status: Patch Available  (was: Open)

 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2299.patch


 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to an O(n) operation.





[jira] [Assigned] (HIVE-1604) Patch to allow variables in Hive

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal reassigned HIVE-1604:
--

Assignee: Vaibhav Aggarwal

 Patch to allow variables in Hive
 

 Key: HIVE-1604
 URL: https://issues.apache.org/jira/browse/HIVE-1604
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-1604.patch


 Patch to Hive which allows command line substitution.
 The patch modifies the Hive command line driver and options processor to 
 support the following arguments:
 hive [-d key=value] [-define key=value]
   -d        Substitution to apply to script
   -define   Substitution to apply to script
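
To make the argument handling above concrete, here is a minimal Python sketch that collects `-d key=value` and `-define key=value` pairs and passes every other argument through. The function name `parse_defines` and its error handling are assumptions for illustration; the patch itself modifies Hive's command line driver and options processor.

```python
def parse_defines(argv):
    """Collect key=value pairs given via -d or -define; return (defines, rest)."""
    defines, rest = {}, []
    args = iter(argv)
    for arg in args:
        if arg in ("-d", "-define"):
            pair = next(args, None)          # the value follows the flag
            if pair is None or "=" not in pair:
                raise ValueError(f"{arg} expects key=value")
            key, value = pair.split("=", 1)  # split on the first '=' only
            defines[key] = value
        else:
            rest.append(arg)                 # not a substitution flag
    return defines, rest
```

Splitting on the first `=` only lets a value contain further `=` characters.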





[jira] [Assigned] (HIVE-2195) Hive queries hangs with first stage job created with zero mappers and 1 reducer,

2011-07-05 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal reassigned HIVE-2195:
--

Assignee: Vaibhav Aggarwal

 Hive queries hangs with first stage job created with zero mappers and 1 
 reducer,  
 --

 Key: HIVE-2195
 URL: https://issues.apache.org/jira/browse/HIVE-2195
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: vitthal (Suhas) Gogate
Assignee: Vaibhav Aggarwal

 This happens when a query aggregates data with a predicate that selects only 
 non-existent data partitions. 
 E.g., table XXX has five columns: A (int), B (int), C (string), date (int), and 
 hour (int), where date and hour are the partition columns. 
 select cast((100*date+hour) as BIGINT) as datehour, sum(A) as sumA,
 sum(B) as sumB from XXX where date=20110925 and C='test' group by date, hour 
 order by datehour  
 Note that the selected date partition does not exist in the Hive table, 
 i.e. there are no partitions for date=20110925.
 The query hangs in the first MapReduce job it creates, which has zero 
 mappers and 1 reducer.





[jira] [Created] (HIVE-2258) Honor -S silent floag during hadoop rmr command

2011-07-05 Thread Vaibhav Aggarwal (JIRA)
Honor -S silent floag during hadoop rmr command
---

 Key: HIVE-2258
 URL: https://issues.apache.org/jira/browse/HIVE-2258
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal








[jira] [Updated] (HIVE-2258) Honor -S flag during hadoop rmr command

2011-07-05 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2258:
---

Description: 
Currently, even if the -S flag is specified, the output of the hadoop -rmr 
command is printed to the screen.
The reason is that the command writes its output to the screen instead of the log file.

I have fixed the problem by temporarily redirecting the output for that command.
Summary: Honor -S flag during hadoop rmr command  (was: Honor -S silent 
floag during hadoop rmr command)

 Honor -S flag during hadoop rmr command
 ---

 Key: HIVE-2258
 URL: https://issues.apache.org/jira/browse/HIVE-2258
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal

 Currently, even if the -S flag is specified, the output of the hadoop -rmr 
 command is printed to the screen.
 The reason is that the command writes its output to the screen instead of the log file.
 I have fixed the problem by temporarily redirecting the output for that 
 command.
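
Temporarily redirecting output around a single call, as described above, can be sketched in Python like this. The helper `run_quietly` and its `silent` parameter are hypothetical stand-ins for the -S handling; the real fix lives in Hive's Java code and is not shown in this thread.

```python
import contextlib
import io


def run_quietly(command, silent):
    """Run `command()`; when `silent` is true, capture what it prints."""
    if not silent:
        command()
        return ""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        command()                 # output lands in the buffer, not on screen
    return buffer.getvalue()      # could be forwarded to a log file instead
```

The redirect is scoped to the `with` block, so normal output resumes as soon as the command returns.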





[jira] [Updated] (HIVE-2259) Skip comments in hive script

2011-07-05 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2259:
---

Attachment: HIVE-2259.patch

 Skip comments in hive script
 

 Key: HIVE-2259
 URL: https://issues.apache.org/jira/browse/HIVE-2259
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2259.patch


 If you specify something like:
 -- This is a comment
 add jar jar_path;
 select * from my_table;
 This fails.
 I have created a fix to skip the commented lines.
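
A minimal Python sketch of skipping commented lines, assuming a comment is any line whose first non-blank characters are `--`. The function name `strip_comment_lines` is hypothetical, and the sketch deliberately ignores trailing comments and `--` inside string literals.

```python
def strip_comment_lines(script):
    """Drop lines whose first non-blank characters are '--'."""
    kept = [line for line in script.splitlines() if not line.lstrip().startswith("--")]
    return "\n".join(kept)


script = """-- This is a comment
add jar jar_path;
select * from my_table;"""
cleaned = strip_comment_lines(script)
```

Applied to the example from the description, only the two statements remain.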





[jira] [Assigned] (HIVE-2150) Sampling fails after dynamic-partition insert into a bucketed s3n table

2011-07-01 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal reassigned HIVE-2150:
--

Assignee: Vaibhav Aggarwal

 Sampling fails after dynamic-partition insert into a bucketed s3n table
 ---

 Key: HIVE-2150
 URL: https://issues.apache.org/jira/browse/HIVE-2150
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Steven Wong
Assignee: Vaibhav Aggarwal

 When using dynamic-partition insert and bucketing together on an s3n table, 
 the insert does not create files for empty buckets. This will result in the 
 following exception when running a sampling query that includes the empty 
 buckets.
 {noformat}
 FAILED: Hive Internal Error: java.lang.RuntimeException(Cannot get bucket path for bucket 1)
 java.lang.RuntimeException: Cannot get bucket path for bucket 1
   at org.apache.hadoop.hive.ql.metadata.Partition.getBucketPath(Partition.java:367)
   at org.apache.hadoop.hive.ql.optimizer.SamplePruner.prune(SamplePruner.java:186)
   at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:603)
   at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:514)
   at org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1.processFS(GenMRFileSink1.java:586)
   at org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1.process(GenMRFileSink1.java:145)
   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
   at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:55)
   at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
   at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
   at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
   at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:6336)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6615)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:332)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:686)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:149)
   at org.apache.hadoop.hive.cli.CliDriver.processLineInternal(CliDriver.java:228)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:209)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:355)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
   at org.apache.hadoop.hive.ql.metadata.Partition.getBucketPath(Partition.java:365)
   ... 27 more
 {noformat}
 Here is a repro case:
 {noformat}
 CREATE TABLE tab
 (x string)
 PARTITIONED BY (p1 string, p2 string)
 CLUSTERED BY (x) INTO 4 BUCKETS
 LOCATION 's3n://some/path';
 SET hive.exec.dynamic.partition=true;
 SET hive.enforce.bucketing=true;
 INSERT OVERWRITE TABLE tab
 PARTITION (p1='p', p2)
 SELECT 'v1', 'v2'
 FROM dual;
 SELECT *
 FROM tab TABLESAMPLE (BUCKET 2 OUT OF 4);
 {noformat}
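
 The failure mode in the trace above (an ArrayIndexOutOfBoundsException for
 index 1 while looking up a bucket path) can be illustrated with a minimal
 Java sketch. This is not Hive's Partition.getBucketPath code. The class
 name, both method names, and the file name "000000_0" are assumptions made
 for illustration, as is the premise that the partition directory holds
 fewer files than the table has declared buckets.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative only: a toy model of a bucket-file lookup, not Hive's implementation.
final class BucketLookupSketch {

    // Unchecked lookup: throws when fewer files exist than declared buckets.
    static String uncheckedBucketPath(List<String> sortedBucketFiles, int bucketIndex) {
        return sortedBucketFiles.get(bucketIndex);
    }

    // Checked lookup: a missing file is reported as an empty bucket instead of an error.
    static Optional<String> checkedBucketPath(List<String> sortedBucketFiles, int bucketIndex) {
        if (bucketIndex < 0 || bucketIndex >= sortedBucketFiles.size()) {
            return Optional.empty();
        }
        return Optional.of(sortedBucketFiles.get(bucketIndex));
    }

    public static void main(String[] args) {
        // Assumption: the table declares 4 buckets but only one bucket file was written.
        List<String> files = Arrays.asList("000000_0");

        // BUCKET 2 OUT OF 4 corresponds to zero-based index 1.
        System.out.println("checked lookup present: " + checkedBucketPath(files, 1).isPresent());

        try {
            uncheckedBucketPath(files, 1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e.getClass().getSimpleName());
        }
    }
}
```

 Returning an empty result for a missing bucket file is only one possible
 design; whether a sampling query should treat it as an empty bucket is a
 decision for the actual fix.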





[jira] [Commented] (HIVE-2086) Data loss with external table

2011-06-24 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054550#comment-13054550
 ] 

Vaibhav Aggarwal commented on HIVE-2086:


Has this patch been committed or is anyone still working on this particular 
patch?

 Data loss with external table
 -

 Key: HIVE-2086
 URL: https://issues.apache.org/jira/browse/HIVE-2086
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
 Environment: Amazon Elastic MapReduce cluster
Reporter: Q Long
Assignee: Jonathan Natkins
 Attachments: HIVE-2086.1.patch, HIVE-2086.2.patch, create_like.q.out


 Data loss occurs when using the CREATE EXTERNAL TABLE ... LIKE statement.
 1) Set up an external table S that points to location L. Populate data in S.
 2) Create another external table T, using a statement like this:
 create external table T like S location L
 Make sure table T points to the same location as the original table S.
 3) Query table T; it shows the same set of data as S.
 4) Drop table T.
 5) Querying table S now returns nothing, and location L has been deleted.
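
 The behaviour these steps expect (dropping an external table leaves its
 location intact) can be written down as a small Java model. This is a toy
 sketch, not the Hive metastore API: the class ExternalTableSketch and its
 methods are hypothetical, and it makes no claim about the root cause of
 the bug.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of table metadata and storage locations (hypothetical, not Hive code).
final class ExternalTableSketch {

    private static final class Table {
        final String location;
        final boolean external;

        Table(String location, boolean external) {
            this.location = location;
            this.external = external;
        }
    }

    private final Map<String, Table> tables = new HashMap<>();
    private final Set<String> locations = new HashSet<>();

    void createTable(String name, String location, boolean external) {
        tables.put(name, new Table(location, external));
        locations.add(location);
    }

    // Expected behaviour: only a managed (non-external) table owns its data.
    void dropTable(String name) {
        Table dropped = tables.remove(name);
        if (dropped != null && !dropped.external) {
            locations.remove(dropped.location);
        }
    }

    boolean locationExists(String location) {
        return locations.contains(location);
    }

    public static void main(String[] args) {
        ExternalTableSketch metastore = new ExternalTableSketch();
        metastore.createTable("S", "L", true); // step 1
        metastore.createTable("T", "L", true); // step 2: T shares location L
        metastore.dropTable("T");              // step 4
        // Step 5 should still find the data behind S.
        System.out.println("location L exists: " + metastore.locationExists("L"));
    }
}
```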





[jira] Updated: (HIVE-1790) Patch to support HAVING clause in Hive

2010-12-20 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-1790:
---

Attachment: HIVE-1790.4.patch.txt

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1790-1.patch, HIVE-1790.2.patch.txt, 
 HIVE-1790.3.patch.txt, HIVE-1790.4.patch.txt, HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1790) Patch to support HAVING clause in Hive

2010-12-20 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12973463#action_12973463
 ] 

Vaibhav Aggarwal commented on HIVE-1790:


Please note that I have not created a common method for genFilterPlan and 
genHavingPlan.
genHavingPlan needs to populate the input row resolver with additional 
information related to column aliases, which genFilterPlan does not do.
I don't think there is a lot of common code between the two methods. In this 
case I would prefer separate methods over if statements in the code, since 
that keeps the control flow much cleaner.
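
The design preference described above can be sketched as follows. This is a
simplified illustration, not the SemanticAnalyzer code: the Resolver class,
the method names filterPlan and havingPlan, and their parameters are all
hypothetical stand-ins. The only point shown is that the HAVING path adds
alias information to the resolver before resolving the predicate, while the
WHERE path does not, so each method stays free of mode flags.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "two separate methods" instead of one method with an if flag.
final class PlanSketch {

    // Stand-in for a row resolver: maps a name visible to the predicate to a column.
    static final class Resolver {
        final Map<String, String> columns = new HashMap<>();
    }

    // WHERE-style path: resolves the predicate against the input columns only.
    static String filterPlan(Resolver input, String predicateName) {
        String column = input.columns.get(predicateName);
        if (column == null) {
            throw new IllegalArgumentException("unknown column: " + predicateName);
        }
        return "FILTER(" + column + ")";
    }

    // HAVING-style path: first makes select-list aliases visible, then resolves.
    static String havingPlan(Resolver input, Map<String, String> selectAliases,
                             String predicateName) {
        Resolver extended = new Resolver();
        extended.columns.putAll(input.columns);
        extended.columns.putAll(selectAliases);
        String column = extended.columns.get(predicateName);
        if (column == null) {
            throw new IllegalArgumentException("unknown column or alias: " + predicateName);
        }
        return "HAVING(" + column + ")";
    }

    public static void main(String[] args) {
        Resolver input = new Resolver();
        input.columns.put("key", "t.key");

        Map<String, String> aliases = new HashMap<>();
        aliases.put("cnt", "count(*)");

        System.out.println(filterPlan(input, "key"));
        System.out.println(havingPlan(input, aliases, "cnt"));
    }
}
```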

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1790-1.patch, HIVE-1790.2.patch.txt, 
 HIVE-1790.3.patch.txt, HIVE-1790.4.patch.txt, HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.




[jira] Commented: (HIVE-1790) Patch to support HAVING clause in Hive

2010-11-15 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932211#action_12932211
 ] 

Vaibhav Aggarwal commented on HIVE-1790:


Hi

I think the patch is ready for review.
This is my first attempt though.

Thanks
Vaibhav

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.




[jira] Commented: (HIVE-1790) Patch to support HAVING clause in Hive

2010-11-15 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932275#action_12932275
 ] 

Vaibhav Aggarwal commented on HIVE-1790:


Hey Carl

Do you know how to access the schema and data in the staging tables?
I am not able to find the table src anywhere in the Hive codebase.

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.




[jira] Updated: (HIVE-1790) Patch to support HAVING clause in Hive

2010-11-15 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-1790:
---

Attachment: HIVE-1790-1.patch

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1790-1.patch, HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.




[jira] Commented: (HIVE-1790) Patch to support HAVING clause in Hive

2010-11-15 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12932318#action_12932318
 ] 

Vaibhav Aggarwal commented on HIVE-1790:


I have added test cases and have uploaded a new patch.

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Fix For: 0.7.0

 Attachments: HIVE-1790-1.patch, HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.




[jira] Created: (HIVE-1790) Patch to support HAVING clause in Hive

2010-11-12 Thread Vaibhav Aggarwal (JIRA)
Patch to support HAVING clause in Hive
--

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal


Currently, Hive users have to write nested queries in order to filter on 
GROUP BY expressions.
This patch allows users to filter on GROUP BY expressions directly by using 
a HAVING clause.
This patch also helps us integrate Hive with other data analysis tools that 
rely on HAVING expressions.
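
As an illustration of what the HAVING clause computes, here is a minimal Java
sketch of "group, aggregate, then filter on the aggregate". It models the
semantics only and says nothing about how the patch implements them. The
table name t, the column key, and the class and method names are assumptions
made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Models the result of these two equivalent queries (hypothetical table t, column key):
//
//   Nested form:  SELECT key, cnt
//                 FROM (SELECT key, COUNT(*) AS cnt FROM t GROUP BY key) sub
//                 WHERE cnt > 1;
//
//   HAVING form:  SELECT key, COUNT(*) FROM t GROUP BY key HAVING COUNT(*) > 1;
final class HavingSketch {

    static Map<String, Long> groupCountHaving(List<String> keys, long minCountExclusive) {
        // GROUP BY key with COUNT(*) as the aggregate.
        Map<String, Long> counts = keys.stream()
            .collect(Collectors.groupingBy(Function.identity(), TreeMap::new,
                                           Collectors.counting()));
        // HAVING COUNT(*) > minCountExclusive: drop groups that fail the predicate.
        counts.values().removeIf(count -> count <= minCountExclusive);
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "a", "c", "a", "b");
        System.out.println(groupCountHaving(keys, 1));
    }
}
```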





[jira] Updated: (HIVE-1790) Patch to support HAVING clause in Hive

2010-11-12 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-1790:
---

Attachment: HIVE-1790.patch

 Patch to support HAVING clause in Hive
 --

 Key: HIVE-1790
 URL: https://issues.apache.org/jira/browse/HIVE-1790
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-1790.patch


 Currently, Hive users have to write nested queries in order to filter on 
 GROUP BY expressions.
 This patch allows users to filter on GROUP BY expressions directly by using 
 a HAVING clause.
 This patch also helps us integrate Hive with other data analysis tools that 
 rely on HAVING expressions.
