[jira] [Updated] (HIVE-27669) Hive Acid CTAS fails incremental if no of rows inserted is > INT_MAX

2023-09-04 Thread Harshal Patel (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshal Patel updated HIVE-27669:
-
Status: Patch Available  (was: In Progress)

https://github.com/apache/hive/pull/4665

> Hive Acid CTAS fails incremental if no of rows inserted is > INT_MAX
> 
>
> Key: HIVE-27669
> URL: https://issues.apache.org/jira/browse/HIVE-27669
> Project: Hive
> Issue Type: Bug
> Reporter: Harshal Patel
> Assignee: Harshal Patel
> Priority: Major
>
> * If a table is created using CTAS with more than INT_MAX rows, Beeline 
> swallows the thrown error.
> * Because replication uses the same infrastructure, it should do the same 
> instead of failing with NumberFormatException.
> *Note:* This happens consistently in the customer's environment, but we have 
> not been able to reproduce it. We therefore went through the whole code flow 
> and handled the error accordingly.
>  
> Error message while incremental replication:
> {code:java}
> 4:12:03.230 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10066:REPL_STATE_LOG] in serial mode
> 4:12:03.231 PM  INFO  ReplState  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: REPL::EVENT_LOAD: {"dbName":"sample","eventId":"50442182","eventType":"EVENT_ALLOC_WRITE_ID","eventsLoadProgress":"2443/20424","loadTime":1687187523,"eventDuration":"159 ms"}
> 4:12:03.231 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10067:COLUMNSTATS] in serial mode
> 4:12:03.488 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10068:DEPENDENCY_COLLECTION] in serial mode
> 4:12:03.488 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10069:DDL] in serial mode
> 4:12:03.504 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10070:REPL_STATE_LOG] in serial mode
> 4:12:03.504 PM  INFO  ReplState  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: REPL::EVENT_LOAD: {"dbName":"sample","eventId":"50442183","eventType":"EVENT_UPDATE_TABLE_COL_STAT","eventsLoadProgress":"2444/20424","loadTime":1687187523,"eventDuration":"273 ms"}
> 4:12:03.504 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10071:DDL] in serial mode
> 4:12:03.596 PM  ERROR  Task  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. java.lang.NumberFormatException: For input string: "5744479373"
>     at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:854) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableReplaceMode(CreateTableOperation.java:127) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:90) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:82) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:772) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:511) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:505) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
> 

[jira] [Updated] (HIVE-21213) Acid table bootstrap replication needs to handle directory created by compaction with txn id

2023-09-04 Thread Teddy Choi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-21213:
--
Release Note: Merged. Thanks.
  Resolution: Fixed
  Status: Resolved  (was: Patch Available)

> Acid table bootstrap replication needs to handle directory created by 
> compaction with txn id
> 
>
> Key: HIVE-21213
> URL: https://issues.apache.org/jira/browse/HIVE-21213
> Project: Hive
> Issue Type: Bug
> Components: Hive, HiveServer2, repl
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21213.01.patch, HIVE-21213.02.patch, 
> HIVE-21213.03.patch, HIVE-21213.04.patch, HIVE-21213.05.patch
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> The current implementation of compaction uses the txn id in the directory 
> name. This isolates queries from reading the directory until compaction has 
> finished and avoids the compactor marking used earlier. During bootstrap 
> replication, a directory is copied as-is, with the same name, from the source 
> to the destination cluster. But a directory created by compaction with a txn 
> id cannot be copied this way, because the txn list at the target may differ 
> from the source: a txn id that is valid at the source may be an aborted txn 
> at the target. So conversion logic is required to create a new directory with 
> a valid txn at the target and dump the data into the newly created directory.
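
For illustration only, the renaming step described above could look roughly like the following Java sketch. It assumes compacted directories carry a visibility suffix of the form `_v<txnId>` padded to seven digits (for example `base_0000005_v0000123`). That naming, the class name `CompactedDirNames`, and both helper methods are assumptions made for this example, not the code from the attached patches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CompactedDirNames {
    // Assumed naming convention: <prefix>_v<txnId>, e.g. base_0000005_v0000123.
    private static final Pattern VISIBILITY_SUFFIX = Pattern.compile("^(.*)_v(\\d+)$");

    /** Returns the txn id embedded in the directory name, or -1 if there is none. */
    static long sourceTxnId(String dirName) {
        Matcher m = VISIBILITY_SUFFIX.matcher(dirName);
        return m.matches() ? Long.parseLong(m.group(2)) : -1L;
    }

    /**
     * Rewrites the directory name so that it carries a txn id that is valid on
     * the target cluster. Names without a visibility suffix are returned unchanged.
     */
    static String withTargetTxn(String dirName, long targetTxnId) {
        Matcher m = VISIBILITY_SUFFIX.matcher(dirName);
        if (!m.matches()) {
            return dirName;
        }
        return m.group(1) + String.format("_v%07d", targetTxnId);
    }

    public static void main(String[] args) {
        System.out.println(withTargetTxn("base_0000005_v0000123", 42L));
    }
}
```

In this sketch the data would then be written under the rewritten name on the target, while directories without a txn suffix keep their original name.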



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27669) Hive Acid CTAS fails incremental if no of rows inserted is > INT_MAX

2023-09-04 Thread Harshal Patel (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshal Patel updated HIVE-27669:
-
Description: 
* If a table is created using CTAS with more than INT_MAX rows, Beeline 
swallows the thrown error.
* Because replication uses the same infrastructure, it should do the same 
instead of failing with NumberFormatException.

*Note:* This happens consistently in the customer's environment, but we have 
not been able to reproduce it. We therefore went through the whole code flow 
and handled the error accordingly.

 

Error message while incremental replication:
{code:java}
4:12:03.230 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10066:REPL_STATE_LOG] in serial mode
4:12:03.231 PM  INFO  ReplState  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: REPL::EVENT_LOAD: {"dbName":"sample","eventId":"50442182","eventType":"EVENT_ALLOC_WRITE_ID","eventsLoadProgress":"2443/20424","loadTime":1687187523,"eventDuration":"159 ms"}
4:12:03.231 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10067:COLUMNSTATS] in serial mode
4:12:03.488 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10068:DEPENDENCY_COLLECTION] in serial mode
4:12:03.488 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10069:DDL] in serial mode
4:12:03.504 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10070:REPL_STATE_LOG] in serial mode
4:12:03.504 PM  INFO  ReplState  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: REPL::EVENT_LOAD: {"dbName":"sample","eventId":"50442183","eventType":"EVENT_UPDATE_TABLE_COL_STAT","eventsLoadProgress":"2444/20424","loadTime":1687187523,"eventDuration":"273 ms"}
4:12:03.504 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10071:DDL] in serial mode
4:12:03.596 PM  ERROR  Task  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Failed
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. java.lang.NumberFormatException: For input string: "5744479373"
    at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:854) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableReplaceMode(CreateTableOperation.java:127) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:90) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:82) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:772) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:511) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:505) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.scheduled.ScheduledQueryExecutionService$ScheduledQueryExecutor.processQuery(ScheduledQueryExecutionService.java:240) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
    at org.apache.hadoop.hive.ql.scheduled.ScheduledQueryExecutionService$ScheduledQueryExecutor.run(ScheduledQueryExecutionService.java:193)
 

[jira] [Work started] (HIVE-27669) Hive Acid CTAS fails incremental if no of rows inserted is > INT_MAX

2023-09-04 Thread Harshal Patel (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-27669 started by Harshal Patel.

> Hive Acid CTAS fails incremental if no of rows inserted is > INT_MAX
> 
>
> Key: HIVE-27669
> URL: https://issues.apache.org/jira/browse/HIVE-27669
> Project: Hive
> Issue Type: Bug
> Reporter: Harshal Patel
> Assignee: Harshal Patel
> Priority: Major
>
> * If a table is created using CTAS with more than INT_MAX rows, Beeline 
> swallows the thrown error.
> * Because replication uses the same infrastructure, it should do the same 
> instead of failing with NumberFormatException.
> *Note:* This happens consistently in the customer's environment, but we have 
> not been able to reproduce it. We therefore went through the whole code flow 
> and handled the error accordingly.
>  
> Error message while incremental replication:
> {code:java}
> 4:12:03.230 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10066:REPL_STATE_LOG] in serial mode
> 4:12:03.231 PM  INFO  ReplState  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: REPL::EVENT_LOAD: {"dbName":"sample","eventId":"50442182","eventType":"EVENT_ALLOC_WRITE_ID","eventsLoadProgress":"2443/20424","loadTime":1687187523,"eventDuration":"159 ms"}
> 4:12:03.231 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10067:COLUMNSTATS] in serial mode
> 4:12:03.488 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10068:DEPENDENCY_COLLECTION] in serial mode
> 4:12:03.488 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10069:DDL] in serial mode
> 4:12:03.504 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10070:REPL_STATE_LOG] in serial mode
> 4:12:03.504 PM  INFO  ReplState  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: REPL::EVENT_LOAD: {"dbName":"sample","eventId":"50442183","eventType":"EVENT_UPDATE_TABLE_COL_STAT","eventsLoadProgress":"2444/20424","loadTime":1687187523,"eventDuration":"273 ms"}
> 4:12:03.504 PM  INFO  Driver  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Starting task [Stage-10071:DDL] in serial mode
> 4:12:03.596 PM  ERROR  Task  [Scheduled Query Executor(schedule:repl_sample_acid_1, execution_id:49625)]: Failed
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. java.lang.NumberFormatException: For input string: "5744479373"
>     at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:854) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.createTableReplaceMode(CreateTableOperation.java:127) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:90) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:82) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:772) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:511) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:505) ~[hive-exec-3.1.3000.7.1.8.15-5.jar:3.1.3000.7.1.8.15-5]
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
> 

[jira] [Created] (HIVE-27669) Hive Acid CTAS fails incremental if no of rows inserted is > INT_MAX

2023-09-04 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-27669:


 Summary: Hive Acid CTAS fails incremental if no of rows inserted 
is > INT_MAX
 Key: HIVE-27669
 URL: https://issues.apache.org/jira/browse/HIVE-27669
 Project: Hive
  Issue Type: Bug
Reporter: Harshal Patel
Assignee: Harshal Patel


* If a table is created using CTAS with more than INT_MAX rows, Beeline 
swallows the thrown error.
* Because replication uses the same infrastructure, it should do the same 
instead of failing with NumberFormatException.

*Note:* This happens consistently in the customer's environment, but we have 
not been able to reproduce it. We therefore went through the whole code flow 
and handled the error accordingly.
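
To make the failure mode concrete, here is a small Java sketch of why a value such as "5744479373" (the input string from the reported NumberFormatException) cannot be parsed as an int but parses fine as a long. The class and method names are hypothetical, and the sketch does not claim to show where Hive parses the value or how the patch handles it.

```java
final class RowCountParsing {
    /** True if the decimal string fits in a 32-bit int (at most Integer.MAX_VALUE = 2147483647). */
    static boolean fitsInInt(String value) {
        try {
            Integer.parseInt(value);
            return true;
        } catch (NumberFormatException e) {
            // Values above INT_MAX land here, matching the exception type in the report.
            return false;
        }
    }

    /** Parses the same string as a 64-bit long, which has room for row counts above INT_MAX. */
    static long parseAsLong(String value) {
        return Long.parseLong(value);
    }

    public static void main(String[] args) {
        String numRows = "5744479373";
        System.out.println("fits in int: " + fitsInInt(numRows));
        System.out.println("as long: " + parseAsLong(numRows));
    }
}
```

Whether the right handling is to widen the type or to tolerate the parse failure depends on the calling code. The sketch only demonstrates the overflow itself.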





[jira] [Commented] (HIVE-27664) AlterTableSetLocationAnalyzer threw a confusing exception "Cannot connect to namenode"

2023-09-04 Thread xiongyinke (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761913#comment-17761913
 ] 

xiongyinke commented on HIVE-27664:
---

[~daijy] Hi daijy, could you help me take a look at this?
The PR is https://github.com/apache/hive/pull/4651
Best wishes!

> AlterTableSetLocationAnalyzer threw a confusing exception "Cannot connect to 
> namenode"
> --
>
> Key: HIVE-27664
> URL: https://issues.apache.org/jira/browse/HIVE-27664
> Project: Hive
> Issue Type: Improvement
> Affects Versions: 4.0.0-beta-1
> Reporter: xiongyinke
> Assignee: xiongyinke
> Priority: Major
>
> @Override
> protected void analyzeCommand(TableName tableName, Map<String, String> partitionSpec, ASTNode command)
>     throws SemanticException {
>   String newLocation = unescapeSQLString(command.getChild(0).getText());
>   try {
>     // To make sure host/port pair is valid, the status of the location does not matter
>     FileSystem.get(new URI(newLocation), conf).getFileStatus(new Path(newLocation));
>   } catch (FileNotFoundException e) {
>     // Only check host/port pair is valid, whether the file exists or not does not matter
>   } catch (Exception e) {
>     throw new SemanticException("Cannot connect to namenode, please check if host/port pair for " + newLocation +
>         " is valid", e);
>   }
> When the call
> "FileSystem.get(new URI(newLocation), conf).getFileStatus(new 
> Path(newLocation))"
> throws a "Permission denied" exception, the Beeline client receives the 
> confusing exception "Cannot connect to namenode, please check if 
> host/port pair for". In reality, the issue is not with the namenode.
>  
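
One possible direction, sketched here with JDK classes only: pick the message according to the exception type instead of funnelling every failure into the namenode message. `java.nio.file.AccessDeniedException` stands in for Hadoop's permission exception, and `LocationCheckMessages` and `describe` are hypothetical names. This is not the change proposed in the PR.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.AccessDeniedException;

final class LocationCheckMessages {
    /**
     * Maps a failure from the location check to a user-facing message.
     * Returns null when the failure can be ignored (the path simply does not exist).
     */
    static String describe(Exception e, String location) {
        if (e instanceof FileNotFoundException) {
            // Existence of the path does not matter for this check.
            return null;
        }
        if (e instanceof AccessDeniedException) {
            // Stand-in for a "Permission denied" failure: report it as such.
            return "Permission denied while checking " + location;
        }
        return "Cannot connect to namenode, please check if host/port pair for "
            + location + " is valid";
    }

    public static void main(String[] args) {
        String location = "hdfs://nn.example:8020/warehouse/t1"; // hypothetical location
        System.out.println(describe(new AccessDeniedException(location), location));
        System.out.println(describe(new IOException("Connection refused"), location));
    }
}
```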





[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then the \u0004 char appears in the 
output. Here is an example:

 

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
      "Map_Key1",
        named_struct(
            'Id',
            'Id_Value1',
            'Name',
            'Name_Value1'
        ),
      "Map_Key2",
        named_struct(
            'Id',
            'Id_Value2',
            'Name',
            'Name_Value2'
        )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;

select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
Output of 2nd select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.
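
As a rough illustration of the expected behaviour, the Java sketch below splits a serialized map-of-struct value one nesting level at a time. It assumes the default text separators \u0002 (map entries), \u0003 (key/value) and \u0004 (struct fields inside a map value). The class name and the parsing code are hypothetical and are not taken from the vectorized deserializer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class NestedDelimiterSketch {
    // Assumed default separators, one per nesting level below the column level.
    static final String ENTRY_SEP = "\u0002";        // between map entries
    static final String KEY_VALUE_SEP = "\u0003";    // between a map key and its value
    static final String STRUCT_FIELD_SEP = "\u0004"; // between struct fields inside a map value

    /** Parses a serialized map<string, struct<...>> value into nested maps. */
    static Map<String, Map<String, String>> parse(String raw, String... fieldNames) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (String entry : raw.split(Pattern.quote(ENTRY_SEP))) {
            String[] kv = entry.split(Pattern.quote(KEY_VALUE_SEP), 2);
            // Split the value one level deeper; skipping this step leaves the
            // whole value, \u0004 included, in the first struct field.
            String[] values = kv.length < 2
                ? new String[0]
                : kv[1].split(Pattern.quote(STRUCT_FIELD_SEP), -1);
            Map<String, String> struct = new LinkedHashMap<>();
            for (int i = 0; i < fieldNames.length; i++) {
                struct.put(fieldNames[i], i < values.length ? values[i] : null);
            }
            result.put(kv[0], struct);
        }
        return result;
    }

    public static void main(String[] args) {
        String raw = "Map_Key1\u0003Id_Value1\u0004Name_Value1"
            + "\u0002Map_Key2\u0003Id_Value2\u0004Name_Value2";
        System.out.println(parse(raw, "id", "name"));
    }
}
```

If the struct-level split is skipped, the first field receives the unsplit value and the second stays null, which matches the shape of the incorrect output of the 1st select statement.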

 

*To reproduce this issue:*

*mvn test -Dtest=TestCliDriver -Pitests -Dqfile=`qfile_name` -pl itests/qtest 
-Dtest.output.overwrite*

  was:
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then the \u0004 char appears in the 
output. Here is an example:

 

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
      "Map_Key1",
        named_struct(
            'Id',
            'Id_Value1',
            'Name',
            'Name_Value1'
        ),
      "Map_Key2",
        named_struct(
            'Id',
            'Id_Value2',
            'Name',
            'Name_Value2'
        )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;

select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
Output of 2nd select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Raghav Aggarwal
> Assignee: Raghav Aggarwal
> Priority: Major
>
> When reading a text table with vectorization on and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema 
> is like: map then the \u0004 char appears in 
> the output. Here is an example:
>  
> Sample q file:
>  
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table4` as
> select
>   'bob' as name,
>   map(
>       "Map_Key1",
>         named_struct(
>             'Id',
>             'Id_Value1',
>             'Name',
>             'Name_Value1'
>         ),
>       "Map_Key2",
>         named_struct(
>             'Id',
>             'Id_Value2',
>             'Name',
>             'Name_Value2'
>         )
>   ) as testmarks;
> select * from table4;
> set hive.vectorized.execution.enabled=false;
> select * from table4;
> {code}
> Output of 1st select statement:
> {code:java}
> bob·    
> {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
> Output of 2nd select statement:
> {code:java}
> bob·    
> {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
>  
> The MAP complex type does not handle the case where it contains a nested 
> complex type such as STRUCT, ARRAY, or UNION.
>  
> *To reproduce this issue:*
> *mvn test -Dtest=TestCliDriver -Pitests -Dqfile=`qfile_name` -pl itests/qtest 

[jira] [Resolved] (HIVE-27605) Backport of HIVE-19661 : switch Hive UDFs to use Re2J regex engine

2023-09-04 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27605.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-19661 : switch Hive UDFs to use Re2J regex engine
> --
>
> Key: HIVE-27605
> URL: https://issues.apache.org/jira/browse/HIVE-27605
> Project: Hive
> Issue Type: Sub-task
> Reporter: Aman Raj
> Assignee: Aman Raj
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.2.0
>
>






[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then the \u0004 char appears in the 
output. Here is an example:

 

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
      "Map_Key1",
        named_struct(
            'Id',
            'Id_Value1',
            'Name',
            'Name_Value1'
        ),
      "Map_Key2",
        named_struct(
            'Id',
            'Id_Value2',
            'Name',
            'Name_Value2'
        )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;

select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
Output of 2nd select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.

  was:
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then the \u0004 char appears in the 
output. Here is an example:

 

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
      "Map_Key1",
        named_struct(
            'Id',
            'Id_Value1',
            'Name',
            'Name_Value1'
        ),
      "Map_Key2",
        named_struct(
            'Id',
            'Id_Value2',
            'Name',
            'Name_Value2'
        )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;

select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
Output of 2nd select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Reporter: Raghav Aggarwal
> Assignee: Raghav Aggarwal
> Priority: Major
>
> When reading a text table with vectorization on and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema 
> is like: map then the \u0004 char appears in 
> the output. Here is an example:
>  
> Sample q file:
>  
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table4` as
> select
>   'bob' as name,
>   map(
>       "Map_Key1",
>         named_struct(
>             'Id',
>             'Id_Value1',
>             'Name',
>             'Name_Value1'
>         ),
>       "Map_Key2",
>         named_struct(
>             'Id',
>             'Id_Value2',
>             'Name',
>             'Name_Value2'
>         )
>   ) as testmarks;
> select * from table4;
> set hive.vectorized.execution.enabled=false;
> select * from table4;
> {code}
> Output of 1st select statement:
> {code:java}
> bob·    
> {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
> Output of 2nd select statement:
> {code:java}
> bob·    
> {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
>  
> The MAP complex type does not handle the case where it contains a nested 
> complex type such as STRUCT, ARRAY, or UNION.





[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then the \u0004 char appears in the 
output. Here is an example:

 

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;
create EXTERNAL table `table4` as
select
  'bob' as name,
  map(
      "Map_Key1",
        named_struct(
            'Id',
            'Id_Value1',
            'Name',
            'Name_Value1'
        ),
      "Map_Key2",
        named_struct(
            'Id',
            'Id_Value2',
            'Name',
            'Name_Value2'
        )
  ) as testmarks;

select * from table4;

set hive.vectorized.execution.enabled=false;

select * from table4;
{code}
Output of 1st select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
Output of 2nd select statement:
{code:java}
bob·    
{"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.

  was:
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then the \u0004 char appears in the 
output. Here is an example:

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table6` as
select
  'bob' as name,
  MAP(
    "Key1",
    ARRAY(
      1,
      2,
      3
    ),
    "Key2",
    ARRAY(
    4,
    5,
    6
    )
  ) as testmarks;

select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob·    {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>
> When reading a text table with vectorization on and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema 
> is like: map then a \u0004 character appears in the output. Here is an 
> example:
>  
> Sample q file:
>  
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table4` as
> select
>   'bob' as name,
>   map(
>       "Map_Key1",
>         named_struct(
>             'Id',
>             'Id_Value1',
>             'Name',
>             'Name_Value1'
>         ),
>       "Map_Key2",
>         named_struct(
>             'Id',
>             'Id_Value2',
>             'Name',
>             'Name_Value2'
>         )
>   ) as testmarks;
> select * from table4;
> set hive.vectorized.execution.enabled=false;
> select * from table4;
> {code}
> Output of 1st select statement:
> {code:java}
> bob·    
> {"Map_Key1":{"id":"Id_Value1\u0004Name_Value1","name":null},"Map_Key2":{"id":"Id_Value2\u0004Name_Value2","name":null}}{code}
> Output of 2nd select statement:
> {code:java}
> bob·    
> {"Map_Key1":{"id":"Id_Value1","name":"Name_Value1"},"Map_Key2":{"id":"Id_Value2","name":"Name_Value2"}}{code}
>  
> The MAP complex type does not handle the case where it contains a nested 
> complex type such as STRUCT, ARRAY, or UNION.
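
The delimiter nesting behind the two outputs above can be sketched in Python. This is an illustrative model, not Hive's implementation: it assumes the text format assigns the separators \u0001, \u0002, \u0003, \u0004, ... by nesting depth, and that a map consumes two levels (one between entries, one between key and value). The names SEPARATORS, parse, and parse_row, and the on-disk form of the row, are hypothetical.

```python
# Assumed separators by nesting level: \u0001, \u0002, \u0003, \u0004, ...
SEPARATORS = [chr(code) for code in range(1, 9)]


def parse(text, type_spec, level):
    """Parse text as type_spec; SEPARATORS[level] is its outermost delimiter."""
    kind = type_spec[0]
    if kind == "string":
        return text
    if kind == "int":
        return int(text)
    if kind == "array":
        item_type = type_spec[1]
        return [parse(item, item_type, level + 1)
                for item in text.split(SEPARATORS[level])]
    if kind == "struct":
        fields = type_spec[1]
        parts = text.split(SEPARATORS[level])
        return {name: parse(part, field_type, level + 1)
                for (name, field_type), part in zip(fields, parts)}
    if kind == "map":
        key_type, value_type = type_spec[1], type_spec[2]
        result = {}
        for entry in text.split(SEPARATORS[level]):
            key, value = entry.split(SEPARATORS[level + 1], 1)
            # Keys and values sit two levels below the map itself.
            result[parse(key, key_type, level + 2)] = parse(value, value_type, level + 2)
        return result
    raise ValueError("unsupported type: " + kind)


def parse_row(line, column_types):
    """Split a row on the top-level separator and parse each column at level 1."""
    columns = line.split(SEPARATORS[0])
    return [parse(col, col_type, 1) for col, col_type in zip(columns, column_types)]


STRING = ("string",)
STRUCT = ("struct", [("id", STRING), ("name", STRING)])

# Hypothetical on-disk form of the table4 row under the assumed separators.
row = ("bob\u0001Map_Key1\u0003Id_Value1\u0004Name_Value1"
       "\u0002Map_Key2\u0003Id_Value2\u0004Name_Value2")

# Splitting the map value as a struct recovers both fields.
correct = parse_row(row, [STRING, ("map", STRING, STRUCT)])
# Treating the map value as a plain string leaves \u0004 inside it,
# which resembles the value seen in the vectorized output.
unsplit = parse_row(row, [STRING, ("map", STRING, STRING)])
```

Under these assumptions, `correct` matches the output of the second select, while `unsplit` carries `Id_Value1\u0004Name_Value1` as a single value, similar to the first select.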



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then a \u0004 character appears in the 
output. Here is an example:

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table6` as
select
  'bob' as name,
  MAP(
    "Key1",
    ARRAY(
      1,
      2,
      3
    ),
    "Key2",
    ARRAY(
    4,
    5,
    6
    )
  ) as testmarks;

select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob·    {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
 

The MAP complex type does not handle the case where it contains a nested 
complex type such as STRUCT, ARRAY, or UNION.

  was:
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then a \u0004 character appears in the 
output. Here is an example:

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table6` as
select
  'bob'                                           as name,
  MAP(
    "Key1",
    ARRAY(
      1,
      2,
      3
    ),
    "Key2",
    ARRAY(
    4,
    5,
    6
    )
  )                                               as testmarks;

select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob·    {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
 


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>
> When reading a text table with vectorization on and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema 
> is like: map then a \u0004 character appears in the output. Here is an 
> example:
> Sample q file:
>  
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table6` as
> select
>   'bob' as name,
>   MAP(
>     "Key1",
>     ARRAY(
>       1,
>       2,
>       3
>     ),
>     "Key2",
>     ARRAY(
>     4,
>     5,
>     6
>     )
>   ) as testmarks;
> select * from table6;
> set hive.vectorized.execution.enabled=false;
> select * from table6; {code}
> Output of 1st select statement:
> {code:java}
> bob·    {"Key1":null,"Key2":null} {code}
> Output of 2nd select statement:
> {code:java}
> bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
>  
> The MAP complex type does not handle the case where it contains a nested 
> complex type such as STRUCT, ARRAY, or UNION.
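
To see where \u0004 would sit in the stored row for table6, the following Python sketch serializes the row by nesting depth. It is a model under stated assumptions, not Hive's code: it assumes separators \u0001, \u0002, \u0003, \u0004, ... are assigned per nesting level and that a map uses one level between entries and the next between key and value. The names SEPARATORS, serialize, and serialize_row are hypothetical.

```python
# Assumed separators by nesting level: \u0001, \u0002, \u0003, \u0004, ...
SEPARATORS = [chr(code) for code in range(1, 9)]


def serialize(value, level):
    """Serialize value to text; SEPARATORS[level] is its outermost delimiter."""
    if isinstance(value, dict):
        # A map uses one level between entries and the next between key and value.
        return SEPARATORS[level].join(
            serialize(key, level + 2) + SEPARATORS[level + 1] + serialize(item, level + 2)
            for key, item in value.items())
    if isinstance(value, (list, tuple)):
        # Array items are joined with the separator of the array's own level.
        return SEPARATORS[level].join(serialize(item, level + 1) for item in value)
    return str(value)


def serialize_row(columns):
    """Join top-level columns with the first separator; each column starts at level 1."""
    return SEPARATORS[0].join(serialize(col, 1) for col in columns)


line = serialize_row(["bob", {"Key1": [1, 2, 3], "Key2": [4, 5, 6]}])
print(repr(line))
```

In this model the array items inside the map values end up separated by \u0004, so a reader that does not descend to that level for map values cannot recover the arrays.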





[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then a \u0004 character appears in the 
output. Here is an example:

Sample q file:

 
{code:java}
set hive.fetch.task.conversion=none;
set hive.vectorized.execution.enabled=true;

create EXTERNAL table `table6` as
select
  'bob'                                           as name,
  MAP(
    "Key1",
    ARRAY(
      1,
      2,
      3
    ),
    "Key2",
    ARRAY(
    4,
    5,
    6
    )
  )                                               as testmarks;

select * from table6;
set hive.vectorized.execution.enabled=false;
select * from table6; {code}
Output of 1st select statement:
{code:java}
bob·    {"Key1":null,"Key2":null} {code}
Output of 2nd select statement:
{code:java}
bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
 

  was:
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then a \u0004 character appears in the 
output. Here is an example:

 


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>
> When reading a text table with vectorization on and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema 
> is like: map then a \u0004 character appears in the output. Here is an 
> example:
> Sample q file:
>  
> {code:java}
> set hive.fetch.task.conversion=none;
> set hive.vectorized.execution.enabled=true;
> create EXTERNAL table `table6` as
> select
>   'bob'                                           as name,
>   MAP(
>     "Key1",
>     ARRAY(
>       1,
>       2,
>       3
>     ),
>     "Key2",
>     ARRAY(
>     4,
>     5,
>     6
>     )
>   )                                               as testmarks;
> select * from table6;
> set hive.vectorized.execution.enabled=false;
> select * from table6; {code}
> Output of 1st select statement:
> {code:java}
> bob·    {"Key1":null,"Key2":null} {code}
> Output of 2nd select statement:
> {code:java}
> bob·    {"Key1":[1,2,3],"Key2":[4,5,6]} {code}
>  





[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Description: 
When reading a text table with vectorization on and hive.fetch.task.conversion 
set to none, delimiters are parsed incorrectly in nested complex types 
containing a map. For example, if a column's schema is like: 
map then a \u0004 character appears in the 
output. Here is an example:

 

  was: When reading data in text file format (with vectorization on) that 
contains multiple delimiters such as ^A, ^B, ^C, ^D (i.e. \u0001, \u0002, 
\u0003, \u0004), the data is parsed incorrectly, which leads to incorrect 
results.


> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>
> When reading a text table with vectorization on and 
> hive.fetch.task.conversion set to none, delimiters are parsed incorrectly 
> in nested complex types containing a map. For example, if a column's schema 
> is like: map then a \u0004 character appears in the output. Here is an 
> example:
>  





[jira] [Updated] (HIVE-27662) Incorrect parsing of nested complex types containing map during vectorized text processing

2023-09-04 Thread Raghav Aggarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghav Aggarwal updated HIVE-27662:
---
Summary: Incorrect parsing of nested complex types containing map during 
vectorized text processing  (was: Incorrect parsing of complex type during 
vectorized text processing of data having multiple delimiters)

> Incorrect parsing of nested complex types containing map during vectorized 
> text processing
> --
>
> Key: HIVE-27662
> URL: https://issues.apache.org/jira/browse/HIVE-27662
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Raghav Aggarwal
>Assignee: Raghav Aggarwal
>Priority: Major
>
> When reading data in text file format (with vectorization on) that 
> contains multiple delimiters such as ^A, ^B, ^C, ^D (i.e. \u0001, \u0002, 
> \u0003, \u0004), the data is parsed incorrectly, which leads to incorrect 
> results.


