[
https://issues.apache.org/jira/browse/HIVE-27133?focusedWorklogId=851145&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851145
]
ASF GitHub Bot logged work on HIVE-27133:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 15/Mar/23 13:26
Start Date: 15/Mar/23 13:26
Worklog Time Spent: 10m
Work Description: SourabhBadhya commented on code in PR #4110:
URL: https://github.com/apache/hive/pull/4110#discussion_r1137062590
##########
ql/src/test/queries/clientpositive/limit_max_int.q:
##########
@@ -0,0 +1,6 @@
+--! qt:dataset:src
+select key from src limit 214748364700;
+select key from src where key = '238' limit 214748364700;
+select * from src where key = '238' limit 214748364700;
+select src.key, count(src.value) from src group by src.key limit 214748364700;
+select * from ( select key from src limit 3) sq1 limit 214748364700;
Review Comment:
nit: Please add a newline at the end of the qfile.
##########
common/src/java/org/apache/hive/common/util/HiveStringUtils.java:
##########
@@ -1174,4 +1175,25 @@ private static boolean isComment(String line) {
return lineTrimmed.startsWith("#") || lineTrimmed.startsWith("--");
}
+ /**
+ * Returns the integer value of a string. If the string value exceeds max int, returns Integer.MAX_VALUE;
+ * else if the string value is less than min int, returns Integer.MIN_VALUE.
+ *
+ *
+ * @param value value of the input string
+ * @return integer
+ */
+ public static int convertStringToBoundedInt(String value) {
+ try {
+ BigInteger bigIntValue = new BigInteger(value);
+ if (bigIntValue.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) > 0) {
+ return Integer.MAX_VALUE;
Review Comment:
@vamshikolanu
I agree with @jfsii. Converting a large number to Integer.MAX_VALUE is
misleading to the user.
Consider the following query -
`INSERT INTO TABLE destinationTable SELECT * FROM sourceTable LIMIT <some_large_number>;`
The insert will write records based on the output of the SELECT operator. In
this case, since the limit has been converted to Integer.MAX_VALUE, the number
of records written will be equal to Integer.MAX_VALUE, which might not be what
the user wants.
Perhaps throwing a meaningful exception is better. In the long term, adding
support for large integers in LIMIT clauses would be even better.
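The alternative the reviewers are suggesting can be sketched as follows. This is a hypothetical helper (the name `parseLimitStrict` and the exception type are assumptions, not Hive code): instead of silently clamping an oversized LIMIT value to Integer.MAX_VALUE, it rejects the value with a clear message.

```java
import java.math.BigInteger;

public class LimitParseSketch {

    // Hypothetical helper illustrating the reviewers' suggestion: fail with a
    // meaningful message when the LIMIT value does not fit in an int, rather
    // than silently clamping it to Integer.MAX_VALUE.
    static int parseLimitStrict(String value) {
        BigInteger v = new BigInteger(value);
        if (v.compareTo(BigInteger.valueOf(Integer.MAX_VALUE)) > 0
                || v.compareTo(BigInteger.valueOf(Integer.MIN_VALUE)) < 0) {
            throw new IllegalArgumentException(
                "LIMIT value " + value + " is out of the supported integer range");
        }
        // Safe: the bounds check above guarantees the value fits in an int.
        return v.intValueExact();
    }

    public static void main(String[] args) {
        System.out.println(parseLimitStrict("2147483647")); // prints 2147483647
        try {
            parseLimitStrict("9223372036854775807");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this approach the query from the issue description would fail at compile time with an explicit error about the LIMIT range, instead of either a raw NumberFormatException or a silent cap on the row count.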
Issue Time Tracking
-------------------
Worklog Id: (was: 851145)
Time Spent: 1h (was: 50m)
> Round off limit value greater than int_max to int_max;
> ------------------------------------------------------
>
> Key: HIVE-27133
> URL: https://issues.apache.org/jira/browse/HIVE-27133
> Project: Hive
> Issue Type: Task
> Reporter: vamshi kolanu
> Assignee: vamshi kolanu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Currently, when the limit has a bigint value, it fails with the following
> error. As part of this task, we will round off any value greater than
> int_max to int_max.
> select string_col from alltypes order by 1 limit 9223372036854775807
>
> java.lang.NumberFormatException: For input string: "9223372036854775807"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:583)
> at java.lang.Integer.<init>(Integer.java:867)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1803)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1911)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.doPhase1(SemanticAnalyzer.java:1911)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:12616)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12718)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:450)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:299)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:650)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1503)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1450)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1445)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
> at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:200)
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:265)
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:274)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:565)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:551)
> at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:567)
> at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
> at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)