[jira] [Updated] (IMPALA-6656) Metrics for time spent in BufferAllocator
[ https://issues.apache.org/jira/browse/IMPALA-6656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated IMPALA-6656: -- Description: We should track the total time spent and the time spent in TCMalloc so we can understand where time is going globally. I think we should shard them by CurrentCore() to avoid contention and get more granular metrics. We want a timer for the amount of time spent in SystemAllocator. We probably also want counters for how many times we go down each code path in BufferAllocator::AllocateInternal() (i.e. getting a hit immediately in the local area, evicting a clean page, etc down to doing a full locked scavenge). was: We should track the total time spent and the time spent in TCMalloc so we can understand where time is going globally. I think we should shard these metrics across the arenas so we can see if the problem is just per-arena, and also to avoid contention between threads when updating the metrics. > Metrics for time spent in BufferAllocator > - > > Key: IMPALA-6656 > URL: https://issues.apache.org/jira/browse/IMPALA-6656 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Tim Armstrong >Assignee: Tim Armstrong >Priority: Major > Labels: observability, resource-management > > We should track the total time spent and the time spent in TCMalloc so we can > understand where time is going globally. > I think we should shard them by CurrentCore() to avoid contention and get > more granular metrics. We want a timer for the amount of time spent in > SystemAllocator. We probably also want counters for how many times we go down > each code path in BufferAllocator::AllocateInternal() (i.e. getting a hit > immediately in the local area, evicting a clean page, etc down to doing a > full locked scavenge). 
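The per-core sharding proposed above can be sketched as follows. This is a hypothetical Python sketch (the real implementation would live in the C++ BufferAllocator); a hash of the thread id stands in for CurrentCore(), and the point is that writers touch disjoint shards while a reader sums across them:

```python
import threading

class ShardedTimer:
    """Sketch of a sharded time counter: each updating thread lands on
    its own shard (hashed thread id here, CurrentCore() in the proposal),
    so concurrent updates rarely contend; reads aggregate all shards."""

    def __init__(self, num_shards=8):
        self.num_shards = num_shards
        self.shards = [0.0] * num_shards
        # One lock per shard instead of one global lock.
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def add(self, seconds):
        i = threading.get_ident() % self.num_shards
        with self.locks[i]:
            self.shards[i] += seconds

    def total(self):
        # Aggregate on read; per-shard values also give the granular view.
        return sum(self.shards)

timer = ShardedTimer(4)
timer.add(1.5)
timer.add(0.5)
```

Reading per-shard values directly would give the "more granular metrics" the description asks for, at the cost of slightly stale totals.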
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7859) Nessus Scan finds CGI Generic SQL Injection.
Donghui Xu created IMPALA-7859:
--
Summary: Nessus Scan finds CGI Generic SQL Injection.
Key: IMPALA-7859
URL: https://issues.apache.org/jira/browse/IMPALA-7859
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Donghui Xu

The Nessus scan report shows that the 25000 port and the 25020 port contain the risk of SQL injection, as follows:
+ The following resources may be vulnerable to blind SQL injection :
+ The 'object_type' parameter of the /catalog_object CGI :
/catalog_object?object_name=_impala_builtins_type=DATABASEzz_impala_builtins_type=DATABASEyy
How can I solve this problem? Thanks.
[jira] [Commented] (IMPALA-7858) run-workload.py should support Beeswax's LDAP authentication
[ https://issues.apache.org/jira/browse/IMPALA-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688853#comment-16688853 ] Jim Apple commented on IMPALA-7858:
---
Patch for review: http://gerrit.cloudera.org:8080/11938
> run-workload.py should support Beeswax's LDAP authentication
>
> Key: IMPALA-7858
> URL: https://issues.apache.org/jira/browse/IMPALA-7858
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 3.0
> Reporter: Jim Apple
> Assignee: Jim Apple
> Priority: Minor
>
> {{impala-shell}} supports LDAP authentication with the {{\-\-user}} and {{--ldap}} flags. {{run-workload.py}}, which can use beeswax, the same interface as {{impala-shell}}, should support that, too.
[jira] [Work started] (IMPALA-7858) run-workload.py should support Beeswax's LDAP authentication
[ https://issues.apache.org/jira/browse/IMPALA-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7858 started by Jim Apple.
-
> run-workload.py should support Beeswax's LDAP authentication
>
> Key: IMPALA-7858
> URL: https://issues.apache.org/jira/browse/IMPALA-7858
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 3.0
> Reporter: Jim Apple
> Assignee: Jim Apple
> Priority: Minor
>
> {{impala-shell}} supports LDAP authentication with the {{\-\-user}} and {{--ldap}} flags. {{run-workload.py}}, which can use beeswax, the same interface as {{impala-shell}}, should support that, too.
[jira] [Updated] (IMPALA-7858) run-workload.py should support Beeswax's LDAP authentication
[ https://issues.apache.org/jira/browse/IMPALA-7858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Apple updated IMPALA-7858:
--
Description:
{{impala-shell}} supports LDAP authentication with the {{\-\-user}} and {{--ldap}} flags. {{run-workload.py}}, which can use beeswax, the same interface as {{impala-shell}}, should support that, too.
(was: {{impala-shell}} supports LDAP authentication with the {{--user}} and {{--ldap}} flags. {{run-workload.py}}, which can use beeswax, the same interface as {{impala-shell}}, should support that, too.)
> run-workload.py should support Beeswax's LDAP authentication
>
> Key: IMPALA-7858
> URL: https://issues.apache.org/jira/browse/IMPALA-7858
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 3.0
> Reporter: Jim Apple
> Assignee: Jim Apple
> Priority: Minor
>
> {{impala-shell}} supports LDAP authentication with the {{\-\-user}} and {{--ldap}} flags. {{run-workload.py}}, which can use beeswax, the same interface as {{impala-shell}}, should support that, too.
[jira] [Assigned] (IMPALA-3531) Implement FK/PK "rely novalidate" constraints for better CBO
[ https://issues.apache.org/jira/browse/IMPALA-3531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anurag Mantripragada reassigned IMPALA-3531:
Assignee: Anurag Mantripragada
> Implement FK/PK "rely novalidate" constraints for better CBO
>
> Key: IMPALA-3531
> URL: https://issues.apache.org/jira/browse/IMPALA-3531
> Project: IMPALA
> Issue Type: New Feature
> Components: Catalog, Frontend, Perf Investigation
> Affects Versions: Impala 2.5.0, Impala 2.6.0
> Environment: CDH
> Reporter: Ruslan Dautkhanov
> Assignee: Anurag Mantripragada
> Priority: Minor
> Labels: CBO, performance, ramp-up
>
> Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for Hive to start with something like that for PK/FK constraints, so the CBO has more information for optimizations. It does not have to actually check whether the constraint relationship holds; it can just "rely" on that constraint.
> https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
> So it would be helpful with join cardinality estimates, and with cases like IMPALA-2929.
> https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053 "Overview of Constraint States":
> - Enforcement
> - Validation
> - Belief
> So FK/PK with "rely novalidate" will have Enforcement disabled but Belief = RELY, as is possible in Oracle and now in Hive (HIVE-13076). It opens up additional ways to optimize execution plans.
> As explained in Tom Kyte's "Metadata Matters" http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
> pp.30 - "Tell us how the tables relate and we can remove them from the plan...".
> pp.35 - "Tell us how the tables relate and we have more access paths available...".
> Also it might be helpful when Impala is integrated with Kudu, as the latter requires a PK.
[jira] [Created] (IMPALA-7858) run-workload.py should support Beeswax's LDAP authentication
Jim Apple created IMPALA-7858:
-
Summary: run-workload.py should support Beeswax's LDAP authentication
Key: IMPALA-7858
URL: https://issues.apache.org/jira/browse/IMPALA-7858
Project: IMPALA
Issue Type: Improvement
Components: Infrastructure
Affects Versions: Impala 3.0
Reporter: Jim Apple
Assignee: Jim Apple

{{impala-shell}} supports LDAP authentication with the {{--user}} and {{--ldap}} flags. {{run-workload.py}}, which can use beeswax, the same interface as {{impala-shell}}, should support that, too.
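The flag names above come from {{impala-shell}}; how they would be wired into {{run-workload.py}} is not specified in the issue. A hypothetical sketch of the option parsing (the flag names are from the issue text, everything else is an assumption):

```python
import argparse

# Hypothetical sketch: accept --ldap/--user the way impala-shell does.
# The actual run-workload.py option handling may differ.
parser = argparse.ArgumentParser(description="run-workload.py (sketch)")
parser.add_argument("--ldap", action="store_true",
                    help="authenticate to the Beeswax interface via LDAP")
parser.add_argument("--user", default=None,
                    help="user name to authenticate as")

args = parser.parse_args(["--ldap", "--user", "alice"])
# In practice the password would be prompted for (e.g. getpass.getpass())
# and passed to the Beeswax client along with the user name.
use_ldap = args.ldap and args.user is not None
```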
[jira] [Commented] (IMPALA-7854) Slow ALTER TABLE and LOAD DATA statements for tables with large number of partitions
[ https://issues.apache.org/jira/browse/IMPALA-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688815#comment-16688815 ] vietn commented on IMPALA-7854:
---
I don't think that is possible, since I don't have the authority to upgrade our cluster. The LOAD DATA behavior looks similar to IMPALA-7330, but what about the ALTER TABLE?
> Slow ALTER TABLE and LOAD DATA statements for tables with large number of partitions
>
> Key: IMPALA-7854
> URL: https://issues.apache.org/jira/browse/IMPALA-7854
> Project: IMPALA
> Issue Type: Improvement
> Components: Catalog
> Affects Versions: Impala 2.12.0
> Environment: 14 Nodes
> Table in question has 20 columns, 3 partition columns, and 57,475 partitions
> Reporter: vietn
> Priority: Critical
> Labels: impala, performance
>
> ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE and 6 minutes for LOAD DATA) for tables with a large number of partitions. Our workaround was to use Hive to perform the LOAD DATA and then perform a REFRESH PARTITION using Impala.
> * 14 Nodes
> * Table in question has 20 columns, 3 partition columns, and 57,475 partitions
[jira] [Updated] (IMPALA-6755) Impala Doc: Doc functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()
[ https://issues.apache.org/jira/browse/IMPALA-6755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni updated IMPALA-6755: Summary: Impala Doc: Doc functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN() (was: Impala 2.13 Doc: Doc functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN()) > Impala Doc: Doc functions PERCENTILE_DISC(), PERCENTILE_CONT(), and MEDIAN() > > > Key: IMPALA-6755 > URL: https://issues.apache.org/jira/browse/IMPALA-6755 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Affects Versions: Impala 2.13.0 >Reporter: Alex Rodoni >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-5847) Some query options do not work as expected in .test files
[ https://issues.apache.org/jira/browse/IMPALA-5847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall reassigned IMPALA-5847: -- Assignee: Thomas Tauber-Marshall > Some query options do not work as expected in .test files > - > > Key: IMPALA-5847 > URL: https://issues.apache.org/jira/browse/IMPALA-5847 > Project: IMPALA > Issue Type: Improvement > Components: Infrastructure >Reporter: Alexander Behm >Assignee: Thomas Tauber-Marshall >Priority: Minor > > We often use "set" in .test files to alter query options. Theoretically, a > "set" command should change the session-level query options and in most cases > a single .test file is executed from the same Impala session. However, for > some options using "set" within a query section does not seem to work. For > example, "num_nodes" does not work as expected as shown below. > PyTest: > {code} > import pytest > from tests.common.impala_test_suite import ImpalaTestSuite > class TestStringQueries(ImpalaTestSuite): > @classmethod > def get_workload(cls): > return 'functional-query' > def test_set_bug(self, vector): > self.run_test_case('QueryTest/set_bug', vector) > {code} > Corresponding .test file: > {code} > > QUERY > set num_nodes=1; > select count(*) from functional.alltypes; > select count(*) from functional.alltypes; > select count(*) from functional.alltypes; > RESULTS > 7300 > TYPES > BIGINT > > {code} > After running the test above, I validated that the 3 queries were run from > the same session, and that the queries run a distributed plan. The > "num_nodes" option was definitely not picked up. I am not sure which query > options are affected. In several .test files setting other query options does > seem to work as expected. > I suspect that the test framework might keep its own list of default query > options which get submitted together with the query, so the session-level > options are overridden on a per-request basis. 
For example, if I change the > pytest to remove the "num_nodes" dictionary entry, then the test works as > expected. > PyTest workaround: > {code} > import pytest > from tests.common.impala_test_suite import ImpalaTestSuite > class TestStringQueries(ImpalaTestSuite): > @classmethod > def get_workload(cls): > return 'functional-query' > def test_set_bug(self, vector): > # Workaround SET bug > vector.get_value('exec_option').pop('num_nodes', None) > self.run_test_case('QueryTest/set_bug', vector) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-7856) test_exchange_mem_usage_scaling failing, not hitting expected OOM
[ https://issues.apache.org/jira/browse/IMPALA-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig reassigned IMPALA-7856: -- Assignee: Bikramjeet Vig > test_exchange_mem_usage_scaling failing, not hitting expected OOM > - > > Key: IMPALA-7856 > URL: https://issues.apache.org/jira/browse/IMPALA-7856 > Project: IMPALA > Issue Type: Bug >Affects Versions: Impala 3.2.0 >Reporter: Bikramjeet Vig >Assignee: Bikramjeet Vig >Priority: Critical > Labels: broken-build, flaky-test > > {noformat} > query_test/test_mem_usage_scaling.py:386: in test_exchange_mem_usage_scaling > self.run_test_case('QueryTest/exchange-mem-scaling', vector) > common/impala_test_suite.py:482: in run_test_case > assert False, "Expected exception: %s" % expected_str > E AssertionError: Expected exception: Memory limit exceeded > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7856) test_exchange_mem_usage_scaling failing, not hitting expected OOM
Bikramjeet Vig created IMPALA-7856:
--
Summary: test_exchange_mem_usage_scaling failing, not hitting expected OOM
Key: IMPALA-7856
URL: https://issues.apache.org/jira/browse/IMPALA-7856
Project: IMPALA
Issue Type: Bug
Affects Versions: Impala 3.2.0
Reporter: Bikramjeet Vig

{noformat}
query_test/test_mem_usage_scaling.py:386: in test_exchange_mem_usage_scaling
    self.run_test_case('QueryTest/exchange-mem-scaling', vector)
common/impala_test_suite.py:482: in run_test_case
    assert False, "Expected exception: %s" % expected_str
E   AssertionError: Expected exception: Memory limit exceeded
{noformat}
[jira] [Commented] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
[ https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688669#comment-16688669 ] Tim Armstrong commented on IMPALA-7087: --- I don't have an objection. Usually I think we should err on the side of safety and not implicitly convert things, but I can see valid workflows where you'd want to reduce precision. What does Impala do for decimal columns in text tables with extra precision? > Impala is unable to read Parquet decimal columns with lower precision/scale > than table metadata > --- > > Key: IMPALA-7087 > URL: https://issues.apache.org/jira/browse/IMPALA-7087 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Assignee: Sahil Takiar >Priority: Major > Labels: decimal, parquet > > This is similar to IMPALA-2515, except relates to a different precision/scale > in the file metadata rather than just a mismatch in the bytes used to store > the data. In a lot of cases we should be able to convert the decimal type on > the fly to the higher-precision type. > {noformat} > ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid > type length. Expecting: 11 len in file: 8 > {noformat} > It would be convenient to allow reading parquet files where the > precision/scale in the file can be converted to the precision/scale in the > table metadata without loss of precision. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7087) Impala is unable to read Parquet decimal columns with lower precision/scale than table metadata
[ https://issues.apache.org/jira/browse/IMPALA-7087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688645#comment-16688645 ] Sahil Takiar commented on IMPALA-7087: -- Do we want to tackle Parquet files that have a higher scale compared to the table (e.g. a Parquet file written with scale = 4 being loaded into a table with scale = 2)? It seems like this is a valid pattern in other databases. The returned values just have their least significant digits truncated. Here is what other SQL engines do: *Postgres:* Postgres is able to load data with a higher scale into a table with a lower scale. {code:java} postgres@stakiar-desktop:~$ printf "col1\n1.111" > /tmp/tmp.txt test=# create table dec_test (dec_col decimal(10,2)); test=# copy dec_test(dec_col) from '/tmp/tmp.txt' delimiter ',' csv header; test=# select * from dec_test; dec_col - 1.11 (1 row) {code} The data was written to {{/tmp/tmp.txt}} as {{1.111}}, but is returned by Postgres as {{1.11}}. *Hive:* Hive follows the same behavior as Postgres. {code:java} create table dec_test_high_scale (dec_col decimal(10,4)) stored as parquet; insert into table dec_test_high_scale values (1.); create table dec_test_low_scale (dec_col decimal(10,2)) stored as parquet location 'hdfs://[nn]:[port]/user/hive/warehouse/dec_test_high_scale'; select * from dec_test_low_scale; 1.11 {code} > Impala is unable to read Parquet decimal columns with lower precision/scale > than table metadata > --- > > Key: IMPALA-7087 > URL: https://issues.apache.org/jira/browse/IMPALA-7087 > Project: IMPALA > Issue Type: Sub-task > Components: Backend >Reporter: Tim Armstrong >Assignee: Sahil Takiar >Priority: Major > Labels: decimal, parquet > > This is similar to IMPALA-2515, except relates to a different precision/scale > in the file metadata rather than just a mismatch in the bytes used to store > the data. In a lot of cases we should be able to convert the decimal type on > the fly to the higher-precision type. 
> {noformat} > ERROR: File '/hdfs/path/00_0_x_2' column 'alterd_decimal' has an invalid > type length. Expecting: 11 len in file: 8 > {noformat} > It would be convenient to allow reading parquet files where the > precision/scale in the file can be converted to the precision/scale in the > table metadata without loss of precision. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
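The truncation behavior the comment above reports for Postgres and Hive can be modeled with Python's decimal module. This is a sketch of those reported semantics (drop the least significant digits, no rounding), not of Impala's implementation:

```python
from decimal import Decimal, ROUND_DOWN

def rescale(value: Decimal, scale: int) -> Decimal:
    """Truncate `value` to `scale` fractional digits, dropping the
    least significant digits (the behavior the comment reports for
    loading higher-scale data into a lower-scale column)."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 -> Decimal('0.01')
    return value.quantize(quantum, rounding=ROUND_DOWN)

# The Postgres example above: 1.111 loaded into decimal(10,2).
truncated = rescale(Decimal("1.111"), 2)  # Decimal('1.11')
```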
[jira] [Commented] (IMPALA-7854) Slow ALTER TABLE and LOAD DATA statements for tables with large number of partitions
[ https://issues.apache.org/jira/browse/IMPALA-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688556#comment-16688556 ] bharath v commented on IMPALA-7854: --- Looks similar to IMPALA-7330. Could you try it out on the latest version if possible? > Slow ALTER TABLE and LOAD DATA statements for tables with large number of > partitions > > > Key: IMPALA-7854 > URL: https://issues.apache.org/jira/browse/IMPALA-7854 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Affects Versions: Impala 2.12.0 > Environment: 14 Nodes > Table in question has 20 columns, 3 partition columns, and 57,475 partitions >Reporter: vietn >Priority: Critical > Labels: impala, performance > > ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE > and 6 minutes for LOAD DATA) for tables with a large number of partitions. > Our workaround was to use Hive to perform the LOAD DATA and then perform a > REFRESH PARTITION using Impala. > * 14 Nodes > * Table in question has 20 columns, 3 partition columns, and 57,475 > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7855) Excessive type widening leads to unnecessary casts
Paul Rogers created IMPALA-7855:
---
Summary: Excessive type widening leads to unnecessary casts
Key: IMPALA-7855
URL: https://issues.apache.org/jira/browse/IMPALA-7855
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 3.0
Reporter: Paul Rogers

When writing unit tests, created the following query:
{code:sql}
with query1 (a, b) as (
  select 1 + 1 + id, 2 + 3 + int_col from functional.alltypestiny)
insert into functional.alltypestiny (id, int_col)
partition (month = 5, year = 2018)
select * from query1
{code}
The above fails with the following error:
{noformat}
ERROR: AnalysisException: Possible loss of precision for target table 'functional.alltypestiny'. Expression 'query1.a' (type: BIGINT) would need to be cast to INT for column 'id'
{noformat}
The following does work (for planning, may not actually execute):
{code:sql}
with query1 (a, b) as (
  select cast(1 + 1 + id as int), cast(2 + 3 + int_col as int)
  from functional.alltypestiny)
insert into functional.alltypestiny (id, int_col)
partition (month = 5, year = 2018)
select * from query1
{code}
What this says is that the planner selected type {{BIGINT}} for the (rewritten) expression {{2 + id}} where {{id}} is of type {{INT}}. {{BIGINT}} is a conservative guess: adding 2 to the largest {{INT}} could overflow and require a {{BIGINT}}. Yet, for such a simple case, such aggressive type promotion may be overly cautious. To verify that this is an issue, let's try something similar with Postgres to see if it is as aggressive.
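The conservative widening described in this issue boils down to a worst-case overflow check, which a few lines of Python can illustrate (a sketch of the reasoning only; Impala's actual type propagation lives in the frontend):

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def fits_int32(v: int) -> bool:
    """True if v is representable as a 32-bit signed INT."""
    return INT32_MIN <= v <= INT32_MAX

# Worst case the planner must guard against for `2 + id` when `id` is
# a 32-bit INT column: the sum can exceed INT32_MAX, so the expression
# is typed BIGINT unless an explicit cast narrows it back to INT.
worst_case = INT32_MAX + 2
overflows = not fits_int32(worst_case)
```

Python integers are unbounded, so `worst_case` is computed exactly; the check shows why a planner that only reasons about the operand types, not the literal's magnitude, widens the result.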
[jira] [Updated] (IMPALA-7854) Slow ALTER TABLE and LOAD DATA statements for tables with large number of partitions
[ https://issues.apache.org/jira/browse/IMPALA-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vietn updated IMPALA-7854: -- Description: ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE and 6 minutes for LOAD DATA) for tables with a large number of partitions. Our workaround was to use Hive to perform the LOAD DATA and then perform a REFRESH PARTITION using Impala. * 14 Nodes * Table in question has 20 columns, 3 partition columns, and 57,475 partitions was: ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE and 6 minutes for LOAD DATA) for tables with a large number of partitions. Our workaround was to use Hive to perform the LOAD DATA and then perform a REFRESH PARTITION using Impala. > Slow ALTER TABLE and LOAD DATA statements for tables with large number of > partitions > > > Key: IMPALA-7854 > URL: https://issues.apache.org/jira/browse/IMPALA-7854 > Project: IMPALA > Issue Type: Improvement > Components: Catalog >Affects Versions: Impala 2.12.0 > Environment: 14 Nodes > Table in question has 20 columns, 3 partition columns, and 57,475 partitions >Reporter: vietn >Priority: Critical > Labels: impala, performance > > ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE > and 6 minutes for LOAD DATA) for tables with a large number of partitions. > Our workaround was to use Hive to perform the LOAD DATA and then perform a > REFRESH PARTITION using Impala. > * 14 Nodes > * Table in question has 20 columns, 3 partition columns, and 57,475 > partitions -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Created] (IMPALA-7854) Slow ALTER TABLE and LOAD DATA statements for tables with large number of partitions
Viet Nguyen created IMPALA-7854:
---
Summary: Slow ALTER TABLE and LOAD DATA statements for tables with large number of partitions
Key: IMPALA-7854
URL: https://issues.apache.org/jira/browse/IMPALA-7854
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Affects Versions: Impala 2.12.0
Environment: 14 Nodes
Table in question has 20 columns, 3 partition columns, and 57,475 partitions
Reporter: Viet Nguyen

ALTER TABLE and LOAD DATA statements take minutes (9 minutes for ALTER TABLE and 6 minutes for LOAD DATA) for tables with a large number of partitions. Our workaround was to use Hive to perform the LOAD DATA and then perform a REFRESH PARTITION using Impala.
[jira] [Resolved] (IMPALA-7837) SCAN_BYTES_LIMIT="100M" test failing to raise exception in release build
[ https://issues.apache.org/jira/browse/IMPALA-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikramjeet Vig resolved IMPALA-7837. Resolution: Fixed Fix Version/s: Impala 3.2.0 > SCAN_BYTES_LIMIT="100M" test failing to raise exception in release build > > > Key: IMPALA-7837 > URL: https://issues.apache.org/jira/browse/IMPALA-7837 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Michael Brown >Assignee: Bikramjeet Vig >Priority: Blocker > Fix For: Impala 3.2.0 > > Attachments: impala-logs.tar.gz > > > This test is not raising the expected exception on a *release build*: > {noformat} > QUERY > # Query should fail due to exceeding scan bytes limit. > set SCAN_BYTES_LIMIT="100M"; > select count(*) from tpch.lineitem l1,tpch.lineitem l2, tpch.lineitem l3 where > l1.l_suppkey = l2.l_linenumber and l1.l_orderkey = l2.l_orderkey and > l1.l_orderkey = l3.l_orderkey group by l1.l_comment, l2.l_comment > having count(*) = 99 > CATCH > row_regex:.*terminated due to scan bytes limit of 100.00 M. > {noformat} > {noformat} > Stacktrace > query_test/test_resource_limits.py:46: in test_resource_limits > self.run_test_case('QueryTest/query-resource-limits', vector) > common/impala_test_suite.py:482: in run_test_case > assert False, "Expected exception: %s" % expected_str > E AssertionError: Expected exception: row_regex:.*terminated due to scan > bytes limit of 100.00 M.* > {noformat} > It fails deterministically in CI (3 times in a row). I can't find a query > profile matching the query ID for some reason, but I've attached logs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Commented] (IMPALA-7837) SCAN_BYTES_LIMIT="100M" test failing to raise exception in release build
[ https://issues.apache.org/jira/browse/IMPALA-7837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688505#comment-16688505 ] ASF subversion and git services commented on IMPALA-7837: - Commit 0d0356c9329bf0cf0e7c69dee42f2b8b1315ec05 in impala's branch refs/heads/master from [~bikram.sngh91] [ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=0d0356c ] IMPALA-7837: Fix flakiness in test_resource_limits for release builds test_resource_limits was failing in release build because the queries used were finishing earlier than expected. This resulted in fragment instances not being able to send enough updates to the coordinator in order to hit the limits used for the tests. This patches adds a deterministic sleep to the queries which gives enough time to the coordinator to catch up on reports. Testing: Checked that tests passed on release builds. Change-Id: I4a47391e52f3974db554dfc0d38139d3ee18a1b4 Reviewed-on: http://gerrit.cloudera.org:8080/11933 Reviewed-by: Tim Armstrong Tested-by: Impala Public Jenkins > SCAN_BYTES_LIMIT="100M" test failing to raise exception in release build > > > Key: IMPALA-7837 > URL: https://issues.apache.org/jira/browse/IMPALA-7837 > Project: IMPALA > Issue Type: Bug > Components: Backend >Affects Versions: Impala 3.2.0 >Reporter: Michael Brown >Assignee: Bikramjeet Vig >Priority: Blocker > Attachments: impala-logs.tar.gz > > > This test is not raising the expected exception on a *release build*: > {noformat} > QUERY > # Query should fail due to exceeding scan bytes limit. > set SCAN_BYTES_LIMIT="100M"; > select count(*) from tpch.lineitem l1,tpch.lineitem l2, tpch.lineitem l3 where > l1.l_suppkey = l2.l_linenumber and l1.l_orderkey = l2.l_orderkey and > l1.l_orderkey = l3.l_orderkey group by l1.l_comment, l2.l_comment > having count(*) = 99 > CATCH > row_regex:.*terminated due to scan bytes limit of 100.00 M. 
> {noformat} > {noformat} > Stacktrace > query_test/test_resource_limits.py:46: in test_resource_limits > self.run_test_case('QueryTest/query-resource-limits', vector) > common/impala_test_suite.py:482: in run_test_case > assert False, "Expected exception: %s" % expected_str > E AssertionError: Expected exception: row_regex:.*terminated due to scan > bytes limit of 100.00 M.* > {noformat} > It fails deterministically in CI (3 times in a row). I can't find a query > profile matching the query ID for some reason, but I've attached logs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Updated] (IMPALA-7725) Impala Doc: Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Rodoni updated IMPALA-7725: Labels: future_release_doc in_32 (was: future_release_doc) > Impala Doc: Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the > parquet scanner > > > Key: IMPALA-7725 > URL: https://issues.apache.org/jira/browse/IMPALA-7725 > Project: IMPALA > Issue Type: Sub-task > Components: Docs >Reporter: Alex Rodoni >Assignee: Alex Rodoni >Priority: Major > Labels: future_release_doc, in_32 > > Add to doc that CREATE TABLE LIKE PARQUET still interprets these columns as > BIGINT. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Assigned] (IMPALA-6924) Compute stats profiles should include reference to child queries
[ https://issues.apache.org/jira/browse/IMPALA-6924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Tauber-Marshall reassigned IMPALA-6924: -- Assignee: Thomas Tauber-Marshall > Compute stats profiles should include reference to child queries > > > Key: IMPALA-6924 > URL: https://issues.apache.org/jira/browse/IMPALA-6924 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Affects Versions: Impala 3.0, Impala 2.12.0 >Reporter: Tim Armstrong >Assignee: Thomas Tauber-Marshall >Priority: Major > Labels: observability, supportability > > "Compute stats" queries spawn off child queries that do most of the work. > It's non-trivial to track down the child queries and get their profiles if > something goes wrong. We really should have, at a minimum, the query IDs of > the child queries in the parent's profile and vice-versa.
[jira] [Updated] (IMPALA-7853) Add support to read int64 NANO timestamps to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-7853: Description: PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps in Impala + Parquet. Another way NANO differs from MICRO and MILLI is that NANO can only be described with new logical types in Parquet; it has no converted type equivalent. was: PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps in Impala + Parquet. Another way NANO differs from MICRO and MILLI is that NANO can only be described with new logical types in Parquet; it has no converted type equivalent. > Add support to read int64 NANO timestamps to the parquet scanner > > > Key: IMPALA-7853 > URL: https://issues.apache.org/jira/browse/IMPALA-7853 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Major > > PARQUET-1387 added int64 timestamps with nanosecond precision. > As 64 bits are not enough to represent the whole 1400.. 
range of Impala > timestamps, this new type works with a limited range: > 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC > The benefit of the reduced range is that no validation is necessary during > scanning, as every possible 64 bit value represents a valid timestamp in > Impala. This means it has the potential to be the fastest way to store > timestamps in Impala + Parquet. > Another way NANO differs from MICRO and MILLI is that NANO can only be > described with new logical types in Parquet; it has no converted type > equivalent.
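The range quoted in the description is simply the int64 nanosecond span centered on the Unix epoch (±2^63 ns, roughly ±292 years). A quick sanity check in Python, truncating to microseconds since `datetime` carries no nanosecond field:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# int64 nanoseconds since the epoch, floored to whole microseconds
# (the final nanosecond digits, ...192 and ...807, are dropped).
min_ts = EPOCH + timedelta(microseconds=-(2**63) // 1000)
max_ts = EPOCH + timedelta(microseconds=(2**63 - 1) // 1000)

print(min_ts)  # 1677-09-21 00:12:43.145224+00:00
print(max_ts)  # 2262-04-11 23:47:16.854775+00:00
```

Since every 64-bit pattern maps to a point inside this interval, a scanner can load the values without any per-row range validation.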
[jira] [Updated] (IMPALA-7853) Add support to read int64 NANO timestamps to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-7853: Description: PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps in Impala + Parquet. Another way NANO differs from MICRO and MILLI is that NANO can only be described with new logical types in Parquet; it has no converted type equivalent. was: PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps. Another way NANO differs from MICRO and MILLI is that NANO can only be described with new logical types in Parquet; it has no converted type equivalent. > Add support to read int64 NANO timestamps to the parquet scanner > > > Key: IMPALA-7853 > URL: https://issues.apache.org/jira/browse/IMPALA-7853 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Major > > PARQUET-1387 added int64 timestamps with nanosecond precision. > As 64 bits are not enough to represent the whole 1400.. 
range of Impala > timestamps, this new type works with a limited range: > 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC > The benefit of the reduced range is that no validation is necessary during > scanning, as every possible 64 bit value represents a valid timestamp in > Impala. This means it has the potential to be the fastest way to store > timestamps in Impala + Parquet. > Another way NANO differs from MICRO and MILLI is that NANO can only be > described with new logical types in Parquet; it has no converted type > equivalent.
[jira] [Updated] (IMPALA-7853) Add support to read int64 NANO timestamps to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer updated IMPALA-7853: Description: PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps. Another way NANO differs from MICRO and MILLI is that NANO can only be described with new logical types in Parquet; it has no converted type equivalent. was: PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps. > Add support to read int64 NANO timestamps to the parquet scanner > > > Key: IMPALA-7853 > URL: https://issues.apache.org/jira/browse/IMPALA-7853 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Major > > PARQUET-1387 added int64 timestamps with nanosecond precision. > As 64 bits are not enough to represent the whole 1400.. range of Impala > timestamps, this new type works with a limited range: > 1677-09-21 00:12:43.145224192 .. 
2262-04-11 23:47:16.854775807 UTC > The benefit of the reduced range is that no validation is necessary during > scanning, as every possible 64 bit value represents a valid timestamp in > Impala. This means it has the potential to be the fastest way to store > timestamps. > Another way NANO differs from MICRO and MILLI is that NANO can only be > described with new logical types in Parquet; it has no converted type > equivalent.
[jira] [Work started] (IMPALA-7853) Add support to read int64 NANO timestamps to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-7853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on IMPALA-7853 started by Csaba Ringhofer. --- > Add support to read int64 NANO timestamps to the parquet scanner > > > Key: IMPALA-7853 > URL: https://issues.apache.org/jira/browse/IMPALA-7853 > Project: IMPALA > Issue Type: Improvement > Components: Backend >Reporter: Csaba Ringhofer >Assignee: Csaba Ringhofer >Priority: Major > > PARQUET-1387 added int64 timestamps with nanosecond precision. > As 64 bits are not enough to represent the whole 1400.. range of Impala > timestamps, this new type works with a limited range: > 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC > The benefit of the reduced range is that no validation is necessary during > scanning, as every possible 64 bit value represents a valid timestamp in > Impala. This means it has the potential to be the fastest way to store > timestamps.
[jira] [Created] (IMPALA-7853) Add support to read int64 NANO timestamps to the parquet scanner
Csaba Ringhofer created IMPALA-7853: --- Summary: Add support to read int64 NANO timestamps to the parquet scanner Key: IMPALA-7853 URL: https://issues.apache.org/jira/browse/IMPALA-7853 Project: IMPALA Issue Type: Improvement Components: Backend Reporter: Csaba Ringhofer Assignee: Csaba Ringhofer PARQUET-1387 added int64 timestamps with nanosecond precision. As 64 bits are not enough to represent the whole 1400.. range of Impala timestamps, this new type works with a limited range: 1677-09-21 00:12:43.145224192 .. 2262-04-11 23:47:16.854775807 UTC The benefit of the reduced range is that no validation is necessary during scanning, as every possible 64 bit value represents a valid timestamp in Impala. This means it has the potential to be the fastest way to store timestamps.
[jira] [Resolved] (IMPALA-5050) Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet scanner
[ https://issues.apache.org/jira/browse/IMPALA-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Csaba Ringhofer resolved IMPALA-5050. - Resolution: Implemented Fix Version/s: Impala 3.2.0 > Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS to the parquet > scanner > > > Key: IMPALA-5050 > URL: https://issues.apache.org/jira/browse/IMPALA-5050 > Project: IMPALA > Issue Type: New Feature > Components: Backend >Affects Versions: Impala 2.9.0 >Reporter: Lars Volker >Assignee: Csaba Ringhofer >Priority: Major > Fix For: Impala 3.2.0 > > > This requires updating {{parquet.thrift}} to a version that includes the > {{TIMESTAMP_MICROS}} logical type.