[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2021-05-07 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17341074#comment-17341074
 ] 

ASF subversion and git services commented on IMPALA-9967:
-

Commit e26543426c8b7cc86fc0d9f60c53c7a7fd7bc8a8 in impala's branch 
refs/heads/master from Zoltan Borok-Nagy
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e265434 ]

IMPALA-9967: Add support for reading ORC's TIMESTAMP WITH LOCAL TIMEZONE

ORC-189 and ORC-666 added support for a new timestamp type
'TIMESTAMP WITH LOCAL TIMEZONE' to the ORC library.

This patch adds support for reading such timestamps with Impala.
These are UTC-normalized timestamps, therefore we convert them
to local timezone during scanning.
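
The conversion described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the patch: the function name `utc_instant_to_local` and the fixed UTC+1 offset are assumptions made for the example, and the real scanner works on ORC column batches and uses the session's time zone rules.

```python
from datetime import datetime, timedelta, timezone

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_instant_to_local(seconds, nanos, local_tz):
    """Convert a UTC-normalized instant to a naive wall-clock time in local_tz.

    Hypothetical helper: sub-microsecond precision is dropped because
    Python's datetime only stores microseconds.
    """
    instant = EPOCH_UTC + timedelta(seconds=seconds, microseconds=nanos // 1000)
    # Shift to the reader's zone, then drop the zone info so the result
    # behaves like a plain TIMESTAMP value.
    return instant.astimezone(local_tz).replace(tzinfo=None)


# Assumed local zone for the example: a fixed UTC+1 offset.
utc_plus_one = timezone(timedelta(hours=1))
local = utc_instant_to_local(86400, 500_000_000, utc_plus_one)
print(local)  # 1970-01-02 01:00:00.500000
```

A fixed offset keeps the sketch self-contained; a real conversion would need a time zone database to handle daylight saving transitions.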

Testing:
 * added test for CREATE TABLE LIKE ORC
 * added scanner tests to test_scanners.py

Change-Id: Icb0c6a43ebea21f1cba5b8f304db7c4bd43967d9
Reviewed-on: http://gerrit.cloudera.org:8080/17347
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 


> Scan orc failed when table contains timestamp column
> 
>
>                 Key: IMPALA-9967
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9967
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: Sheng Wang
>            Assignee: Zoltán Borók-Nagy
>            Priority: Minor
>              Labels: impala-iceberg
>         Attachments: 00031-31-26ff2064-c8f2-467f-ab7e-1949cb30d151-0.orc, 00031-31-334beaba-ef4b-4d13-b338-e715cdf0ef85-0.orc
>
>
> Recently, while testing Impala queries against an ORC table, I found that scanning fails 
> when the table contains a timestamp column. Here is the exception: 
> {code:java}
> I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f720002] Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc: Unknown type kind
> @  0x1c9f753  impala::Status::Status()
> @  0x27aa049  impala::HdfsOrcScanner::ProcessFileTail()
> @  0x27a7fb3  impala::HdfsOrcScanner::Open()
> @  0x27365fe  impala::HdfsScanNodeBase::CreateAndOpenScannerHelper()
> @  0x28cb379  impala::HdfsScanNode::ProcessSplit()
> @  0x28caa7d  impala::HdfsScanNode::ScannerThread()
> @  0x28c9de5  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
> @  0x28cc19e  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
> @  0x205  boost::function0<>::operator()()
> @  0x2675d93  impala::Thread::SuperviseThread()
> @  0x267dd30  boost::_bi::list5<>::operator()<>()
> @  0x267dc54  boost::_bi::bind_t<>::operator()()
> @  0x267dc15  boost::detail::thread_data<>::run()
> @  0x3e3c3c1  thread_proxy
> @ 0x7f32360336b9  start_thread
> @ 0x7f3232bfe41c  clone
> I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] 68436a6e0883be84:53877f720002] Error preparing scanner for scan range hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc(0:582). Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-0.orc: Unknown type kind
> {code}
> When I remove the timestamp column from the table and regenerate the test data, the query 
> succeeds. By the way, my test data is generated by Spark.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2020-10-14 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214324#comment-17214324
 ] 

ASF subversion and git services commented on IMPALA-9967:
-

Commit 0c0985a825fba8d9702639e3e679d2e1b9070fe1 in impala's branch 
refs/heads/master from skyyws
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0c0985a ]

IMPALA-10159: Supporting ORC file format for Iceberg table

This patch adds support for querying Iceberg tables stored in the ORC
file format. The following SQL creates such a table:
  CREATE TABLE default.iceberg_test (
    level string,
    event_time timestamp,
    message string
  )
  STORED AS ICEBERG
  LOCATION 'hdfs://xxx'
  TBLPROPERTIES ('iceberg.file_format'='orc',
    'iceberg.catalog'='hadoop.tables');
Note that scanning ORC files that contain Timestamp columns still has
problems; see IMPALA-9967 for details. New tests covering the Timestamp
type may be added once that JIRA is fixed.

Testing:
- Create table tests in functional_schema_template.sql
- Iceberg table create test in test_iceberg.py
- Iceberg table query test in test_scanners.py

Change-Id: Ib579461aa57348c9893a6d26a003a0d812346c4d
Reviewed-on: http://gerrit.cloudera.org:8080/16568
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 








[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2020-08-28 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186464#comment-17186464
 ] 

Zoltán Borók-Nagy commented on IMPALA-9967:
---

So the problem is that the writer writes TIMESTAMP_INSTANT, which is "timestamp 
with local time zone".

The C++ ORC library doesn't support this type yet, only TIMESTAMP.

TIMESTAMP_INSTANT is a relatively new addition to ORC; it is not even mentioned 
in the spec currently: [https://orc.apache.org/specification/ORCv2/]

It was added by ORC-189 and is part of the 1.6 release. This means Hive can't 
read such files either, since it is currently using ORC 1.5.10:
{noformat}
0: jdbc:hive2://localhost:11050/default> select * from orc_test;
Error: java.io.IOException: java.lang.RuntimeException: ORC split generation 
failed with exception: java.io.IOException: Type 4 has an unknown kind. 
(state=,code=0){noformat}
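
To illustrate why both readers fail the same way, here is a minimal Python sketch of a reader that only knows the older type kinds. It is not the ORC library's code: `KnownKind` and `parse_kinds` are hypothetical names, and the numeric kind values should be treated as illustrative rather than as a copy of the ORC protobuf definition. Index 0 stands for the root struct, so the fourth column (`event_time`) is type 4, as in the Hive error above.

```python
from enum import IntEnum


class KnownKind(IntEnum):
    """Type kinds a hypothetical older reader understands (values illustrative)."""
    INT = 3
    STRING = 7
    TIMESTAMP = 9
    STRUCT = 12


# Newer kind written by the file's producer; the older reader has no entry for it.
TIMESTAMP_INSTANT = 18


def parse_kinds(raw_kinds):
    """Map raw kind ids from a file footer to KnownKind, failing on unknown ids."""
    parsed = []
    for type_id, raw in enumerate(raw_kinds):
        try:
            parsed.append(KnownKind(raw))
        except ValueError:
            raise ValueError(f"Type {type_id} has an unknown kind.") from None
    return parsed


# Root struct followed by id, user, action, event_time.
old_file = [12, 3, 7, 7, 9]
new_file = [12, 3, 7, 7, TIMESTAMP_INSTANT]

print(parse_kinds(old_file)[-1].name)  # TIMESTAMP
try:
    parse_kinds(new_file)
except ValueError as err:
    print(err)  # Type 4 has an unknown kind.
```

The failure happens while the schema in the file tail is parsed, before any row data is read, which is consistent with the error being raised from `ProcessFileTail()` in the reported stack trace.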







[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2020-08-27 Thread WangSheng (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186208#comment-17186208
 ] 

WangSheng commented on IMPALA-9967:
---

{code:java}
create external table orc_test(
id int, user string, action string, event_time timestamp) 
stored as orc 
location 'hdfs://localhost:20500/orc_table_test';
{code}
This file contains a timestamp column. After creating an external table over 
it, SELECT throws an exception.
 [^00031-31-26ff2064-c8f2-467f-ab7e-1949cb30d151-0.orc] 


{code:java}
create external table orc_test2(
id int, user string, action string) 
stored as orc 
location 'hdfs://localhost:20500/orc_table_test2';
{code}
This file does not contain a timestamp column. After creating an external table 
over it, SELECT succeeds.
 [^00031-31-334beaba-ef4b-4d13-b338-e715cdf0ef85-0.orc] 









[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2020-08-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185902#comment-17185902
 ] 

Zoltán Borók-Nagy commented on IMPALA-9967:
---

[~skyyws] could you please attach a small ORC file that reproduces the error? 
Thanks







[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2020-07-20 Thread Gabor Kaszab (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160950#comment-17160950
 ] 

Gabor Kaszab commented on IMPALA-9967:
--

AFAIK Timestamp with ORC should work without any issues. I see a number of 
patches on this topic made by [~csringhofer]. Csaba, as you have more 
experience with timestamps on ORC, you might have something relevant to add.

On a side note, I remember testing ORC with data created by Hive, but I'm unsure 
whether we tested with data from Spark.







[jira] [Commented] (IMPALA-9967) Scan orc failed when table contains timestamp column

2020-07-17 Thread Tim Armstrong (Jira)


[ 
https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17160014#comment-17160014
 ] 

Tim Armstrong commented on IMPALA-9967:
---

[~gaborkaszab] [~luksan] is this a known issue?



