[jira] [Commented] (IMPALA-6020) REFRESH statement cannot detect HDFS block movement

2018-05-31 Thread Dimitris Tsirogiannis (JIRA)


[ 
https://issues.apache.org/jira/browse/IMPALA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497571#comment-16497571
 ] 

Dimitris Tsirogiannis commented on IMPALA-6020:
---

[~arodoni_cloudera], I believe it should be removed. 

> REFRESH statement cannot detect HDFS block movement
> ---
>
> Key: IMPALA-6020
> URL: https://issues.apache.org/jira/browse/IMPALA-6020
> Project: IMPALA
>  Issue Type: Bug
>  Components: Docs
>Affects Versions: Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
>Reporter: Jim Apple
>Assignee: Alex Rodoni
>Priority: Major
>
> The release notes state:
> http://impala.apache.org/docs/build/html/topics/impala_new_features.html
> {quote}The REFRESH statement now updates information about HDFS block 
> locations. Therefore, you can perform a fast and efficient REFRESH after 
> doing an HDFS rebalancing operation instead of the more expensive INVALIDATE 
> METADATA statement.
> {quote}
> However, there has been no change on the HDFS or Impala side to support this. 
> There may be some misunderstanding. After HDFS rebalancing, the user still 
> needs to run INVALIDATE METADATA to get the latest block metadata.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org



[jira] [Assigned] (IMPALA-2115) Add metrics for tables with missing or incomplete stats

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-2115:
-

Assignee: (was: Dimitris Tsirogiannis)

> Add metrics for tables with missing or incomplete stats
> ---
>
> Key: IMPALA-2115
> URL: https://issues.apache.org/jira/browse/IMPALA-2115
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 2.2
>Reporter: Matthew Jacobs
>Priority: Minor
>  Labels: ramp-up, supportability
>
> Add catalogd metrics with number of tables with missing stats and number of 
> tables with incomplete stats. This information would need to be exposed to 
> the C++ catalog code over JNI, which is cumbersome unless we have IMPALA-2114.
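A minimal sketch of the two counts requested above. The issue does not define "missing" or "incomplete", so the classification rule here is an assumption: a table has missing stats when it has no row count, and incomplete stats when at least one column lacks stats. The function names and the input shape are hypothetical and are not part of the Impala catalog API.

```python
from collections import Counter


def classify_table(row_count, column_stats):
    """Classify one table's stats as 'missing', 'incomplete', or 'complete'.

    Assumption (not from the issue): no row count means missing stats;
    a row count with at least one column lacking stats means incomplete.
    """
    if row_count is None:
        return "missing"
    if not all(column_stats.values()):
        return "incomplete"
    return "complete"


def count_by_stats_status(tables):
    """tables maps a table name to (row_count, {column name: has_stats})."""
    return Counter(classify_table(rows, cols) for rows, cols in tables.values())


if __name__ == "__main__":
    # Hypothetical tables, for illustration only.
    example = {
        "db.no_stats": (None, {"a": False, "b": False}),
        "db.partial": (1000, {"a": True, "b": False}),
        "db.full": (1000, {"a": True, "b": True}),
    }
    print(dict(count_by_stats_status(example)))
```

The two metrics would then be the "missing" and "incomplete" entries of the returned counter.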






[jira] [Assigned] (IMPALA-5831) Create table operation stuck on statestore update

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-5831:
-

Assignee: (was: Dimitris Tsirogiannis)

> Create table operation stuck on statestore update
> -
>
> Key: IMPALA-5831
> URL: https://issues.apache.org/jira/browse/IMPALA-5831
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.10.0
>Reporter: Mostafa Mokhtar
>Priority: Minor
>  Labels: statestore, stuck, topic
>
> Noticed a create table statement being stuck for a couple of days. 
> {code}
> User:            t...@foo.com
> Default Db:      tpcds_1000_parquet
> Statement:       CREATE TABLE store_sales_insert LIKE store_sales
> Query Type:      DDL
> Start Time:      2017-08-19 16:39:36.417982000
> Duration:        90h26m
> Scan Progress:   N/A
> State:           CREATED
> Last Event:      Planning finished
> # rows fetched:  0
> Resource Pool:   (empty)
> Details:         Details
> Action:          Cancel
> {code}
> Coordinator log
> {code}
> I0819 16:39:36.418120 178699 Frontend.java:888] Compiling query: CREATE TABLE 
> store_sales_insert LIKE store_sales
> I0819 16:39:36.421865 178699 Frontend.java:927] Compiled query.
> I0819 16:39:36.529786 178699 impala-server.cc:1475] Waiting for catalog 
> version: 282 current version: 281
> I0819 16:39:40.296090 178699 impala-server.cc:1489] Waiting for min 
> subscriber topic version: 314 current version: 0
> I0820 00:04:10.183965 177679 authentication.cc:498] Registering 
> impala/vd1302.foo@foo.com, keytab file 
> /var/run/cloudera-scm-agent/process/3262-impala-IMPALAD/impala.keytab
> I0821 00:02:15.233804 177679 authentication.cc:498] Registering 
> impala/vd1302.foo@foo.com, keytab file 
> /var/run/cloudera-scm-agent/process/3262-impala-IMPALAD/impala.keytab
> I0822 00:00:22.234457 177679 authentication.cc:498] Registering 
> impala/vd1302.foo@foo.com, keytab file 
> /var/run/cloudera-scm-agent/process/3262-impala-IMPALAD/impala.keytab
> I0822 23:58:11.268100 177679 authentication.cc:498] Registering 
> impala/vd1302.foo@foo.com, keytab file 
> /var/run/cloudera-scm-agent/process/3262-impala-IMPALAD/impala.keytab
> {code}
> Catalog log
> {code}
> I0823 08:55:40.240483 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 09:05:40.350579 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 09:15:40.460618 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 09:25:40.558987 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 09:35:40.669566 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 09:45:40.780369 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 09:55:40.891625 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 10:05:40.990028 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 10:15:41.098749 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 10:25:41.205286 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 10:35:41.312054 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> I0823 10:45:41.808549 102562 catalog-server.cc:237] Catalog Version: 282 Last 
> Catalog Version: 282
> {code}
> Statestore log 
> {code}
> I0823 11:05:31.107852 102250 statestore.cc:476] Received request for 
> different delta base of topic: catalog-update from: 
> impa...@vd1305.foo.com:22000 subscriber from_version: 0
> I0823 11:05:33.108608 102241 statestore.cc:549] Preparing initial 
> catalog-update topic update for impa...@vd1305.foo.com:22000. Size = 7.83 MB
> I0823 11:05:33.502251 102241 statestore.cc:476] Received request for 
> different delta base of topic: catalog-update from: 
> impa...@vd1305.foo.com:22000 subscriber from_version: 0
> I0823 11:05:35.503106 102248 statestore.cc:549] Preparing initial 
> catalog-update topic update for impa...@vd1305.foo.com:22000. Size = 7.83 MB
> I0823 11:05:35.903831 102248 statestore.cc:476] Received request for 
> different delta base of topic: catalog-update from: 
> impa...@vd1305.foo.com:22000 subscriber from_version: 0
> I0823 11:05:37.904654 102254 statestore.cc:549] Preparing initial 
> catalog-update topic update for impa...@vd1305.foo.com:22000. Size = 7.83 MB
> {code}
> On vd1305 the StatestoreSubscriber-1 thread appears to be stuck in the stack 
> below:
> {code}
> Thread 1 (process 96924):
> #0  0x0035f960e82d in read () from /lib64/libpthread.so.0
> #1  0x0035fc2dea71 in ?? () from /usr/lib64/libcrypto.so.10
> #2  

[jira] [Assigned] (IMPALA-4870) Extend Catalog metrics

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-4870:
-

Assignee: (was: Dimitris Tsirogiannis)

> Extend Catalog metrics
> --
>
> Key: IMPALA-4870
> URL: https://issues.apache.org/jira/browse/IMPALA-4870
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.6.0
>Reporter: Mostafa Mokhtar
>Priority: Major
>  Labels: catalog-server, supportability
>
> It is difficult to tune and debug Catalog performance because the Catalog 
> doesn't expose any metrics indicating its current state. 
> The Catalog should expose the object count and the estimated memory consumed 
> by each object type:
> * Partitions
> * Files
> * Blocks
> A breakdown per DB/Table/Partition would be ideal. 
> It should also expose JVM heap usage, along with GC time as a percentage of 
> the JVM lifetime.
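A rough sketch of the two kinds of numbers requested above: object counts per type, and GC time as a percentage of JVM lifetime. The nested-list input shape and both function names are assumptions made for illustration. In the real catalog these values would come from its own data structures and from the JVM.

```python
from collections import defaultdict


def summarize_objects(tables):
    """Count partitions, files, and blocks across all tables.

    tables maps "db.table" to a list of partitions; each partition is a list
    of files; each file is represented only by its block count (hypothetical).
    """
    totals = defaultdict(int)
    for partitions in tables.values():
        totals["partitions"] += len(partitions)
        for files in partitions:
            totals["files"] += len(files)
            totals["blocks"] += sum(files)
    return dict(totals)


def gc_time_percent(gc_time_ms, jvm_uptime_ms):
    """GC time as a percentage of the JVM lifetime."""
    if jvm_uptime_ms <= 0:
        raise ValueError("JVM uptime must be positive")
    return 100.0 * gc_time_ms / jvm_uptime_ms


if __name__ == "__main__":
    # One table with two partitions: files with 1 and 2 blocks, then one
    # file with 3 blocks.
    print(summarize_objects({"db.t": [[1, 2], [3]]}))
    print(gc_time_percent(30_000, 600_000))
```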






[jira] [Assigned] (IMPALA-3173) Reduce catalog's memory footprint

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-3173:
-

Assignee: (was: Dimitris Tsirogiannis)

> Reduce catalog's memory footprint
> -
>
> Key: IMPALA-3173
> URL: https://issues.apache.org/jira/browse/IMPALA-3173
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.2.4
>Reporter: Dimitris Tsirogiannis
>Priority: Critical
>  Labels: catalog-server, performance, usability
>
> An initial analysis of the catalog's heap dumps shows that we can probably 
> reduce its memory footprint by: a) avoiding storing redundant information 
> about catalog entities such as partitions, and b) using more compressed data 
> structures.  
> Currently, for a table with 2 int columns and 1 int partition column and 
> without incremental stats, we use:
> * *~930B* per partition out of which ~500B are used on hmsParameters_ 
> (Map),  ~190B on cachedMsPartitionDescriptor_,  and ~200B 
> (depending on path) on location.
> * *~800B* per file descriptor out of which ~530B go to file_blocks and the 
> rest are used for storing the file_name.
> * Every HdfsTable also uses two maps that replicate partition locations and 
> file names (e.g. perPartitionFileDescMap_ and nameToPartitionMap_). 
> A table like that with 100,000 partitions and 10 files per partition requires 
> 1GB and 1.4GB of memory without and with incremental stats, respectively. 
> This is a parent JIRA of IMPALA-2840.
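The per-object figures in the description can be turned into a back-of-the-envelope estimate. The sketch below uses only the approximate ~930B per partition and ~800B per file descriptor quoted in the issue. It ignores the two per-table maps and incremental stats, so it is a lower bound rather than a reproduction of the reported totals. The function and constant names are made up for this example.

```python
# Approximate per-object sizes quoted in the issue description, in bytes.
BYTES_PER_PARTITION = 930
BYTES_PER_FILE_DESCRIPTOR = 800


def estimate_catalog_bytes(num_partitions, files_per_partition):
    """Rough lower bound: partitions plus file descriptors only.

    Ignores the per-table maps and incremental stats mentioned in the issue.
    """
    partition_bytes = num_partitions * BYTES_PER_PARTITION
    file_bytes = num_partitions * files_per_partition * BYTES_PER_FILE_DESCRIPTOR
    return partition_bytes + file_bytes


if __name__ == "__main__":
    total = estimate_catalog_bytes(100_000, 10)
    print(f"{total / 1e6:.0f} MB")  # prints "893 MB"
```

For 100,000 partitions with 10 files each this gives roughly 0.9 GB, which is in line with the 1GB figure quoted in the issue once the additional per-table maps are taken into account.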






[jira] [Assigned] (IMPALA-6199) Flaky test: metadata/test_hdfs_permissions.py

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-6199:
-

Assignee: (was: Dimitris Tsirogiannis)

> Flaky test: metadata/test_hdfs_permissions.py
> -
>
> Key: IMPALA-6199
> URL: https://issues.apache.org/jira/browse/IMPALA-6199
> Project: IMPALA
>  Issue Type: Bug
>  Components: Infrastructure
>Affects Versions: Impala 2.11.0
>Reporter: Taras Bobrovytsky
>Priority: Critical
>
> TestHdfsPermissions.test_insert_into_read_only_table failed on a nightly 
> Isilon build with the following error message:
> {code}
>  TestHdfsPermissions.test_insert_into_read_only_table[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 5000, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0} | table_format: text/none] 
> [gw1] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-umbrella-build-and-test-isilon/repos/Impala/bin/../infra/python/env/bin/python
> metadata/test_hdfs_permissions.py:73: in test_insert_into_read_only_table
> self.hdfs_client.delete_file_dir('test-warehouse/%s' % TEST_TBL, 
> recursive=True)
> util/hdfs_util.py:90: in delete_file_dir
> if not self.exists(path):
> util/hdfs_util.py:138: in exists
> self.get_file_dir_status(path)
> util/hdfs_util.py:102: in get_file_dir_status
> return super(PyWebHdfsClientWithChmod, self).get_file_dir_status(path)
> ../infra/python/env/lib/python2.6/site-packages/pywebhdfs/webhdfs.py:338: in 
> get_file_dir_status
> _raise_pywebhdfs_exception(response.status_code, response.content)
> ../infra/python/env/lib/python2.6/site-packages/pywebhdfs/webhdfs.py:477: in 
> _raise_pywebhdfs_exception
> raise errors.PyWebHdfsException(msg=message)
> E   PyWebHdfsException: 
> E   
> E   403 Forbidden
> E   
> E   Forbidden
> E   You don't have permission to access /v1/html/500Text.html
> E   on this server.
> E   
> {code}






[jira] [Assigned] (IMPALA-5963) Extend Catalog metrics to list tables that are being loaded along with the ones in the queue

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-5963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-5963:
-

Assignee: (was: Dimitris Tsirogiannis)

> Extend Catalog metrics to list tables that are being loaded along with the 
> ones in the queue 
> -
>
> Key: IMPALA-5963
> URL: https://issues.apache.org/jira/browse/IMPALA-5963
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Reporter: Mostafa Mokhtar
>Priority: Major
>
> The Catalog log prints the information below, which doesn't clearly show 
> which tables are being loaded, how long the queue is, or how far into the 
> queue a particular table is. 
> {code}
> I0920 11:18:35.363430 25757 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:18:35.363581 25757 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:18:46.076735 25760 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb2.t2
> I0920 11:18:46.076875 25760 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:18:46.272711 25753 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb2.t2
> I0920 11:18:46.272855 25753 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:18:48.300680 25752 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb2.t2
> I0920 11:18:48.300765 25758 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb2.t2
> I0920 11:18:48.301048 25752 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:18:48.301106 25758 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:09.699975 25762 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:19:09.700096 25762 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:12.123028 25755 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:19:12.123165 25755 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:12.986537 25763 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:19:12.986656 25763 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:12.986994 25766 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:19:12.987102 25766 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:14.211904 25759 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:19:14.212038 25759 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:16.055563 25767 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> I0920 11:19:16.055711 25767 TableLoadingMgr.java:287] Remaining items in 
> queue: 0. Loads in progress: 0
> I0920 11:19:16.059411 25754 TableLoadingMgr.java:285] Loading next table from 
> queue: mydb.t1
> {code}
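Until such metrics exist, the log itself can be summarized. The sketch below tallies "Loading next table from queue" lines per table. It assumes the wrapped log lines shown above have been re-joined into single lines, and the regular expression is written against the sample lines in this issue only, not against a documented log format. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# I0920 11:18:35.363430 25757 TableLoadingMgr.java:285] Loading next table from queue: mydb.t1
LOAD_RE = re.compile(
    r"^I\d{4} [\d:.]+\s+(?P<thread>\d+) TableLoadingMgr\.java:\d+\] "
    r"Loading next table from queue: (?P<table>\S+)$"
)


def count_loads(lines):
    """Return how many times each table was picked up from the loading queue."""
    counts = Counter()
    for line in lines:
        match = LOAD_RE.match(line.strip())
        if match:
            counts[match.group("table")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "I0920 11:18:35.363430 25757 TableLoadingMgr.java:285] "
        "Loading next table from queue: mydb.t1",
        "I0920 11:18:35.363581 25757 TableLoadingMgr.java:287] "
        "Remaining items in queue: 0. Loads in progress: 0",
        "I0920 11:18:46.076735 25760 TableLoadingMgr.java:285] "
        "Loading next table from queue: mydb2.t2",
    ]
    print(dict(count_loads(sample)))
```

Applied to the excerpt above, a tally like this makes the repeated loads of the same table visible, which the raw log does not.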






[jira] [Assigned] (IMPALA-6876) Entries in CatalogUsageMonitor are not cleared after invalidation

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-6876:
-

Assignee: (was: Dimitris Tsirogiannis)

> Entries in CatalogUsageMonitor are not cleared after invalidation
> -
>
> Key: IMPALA-6876
> URL: https://issues.apache.org/jira/browse/IMPALA-6876
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Priority: Major
>  Labels: memory-leak
>
> The CatalogUsageMonitor in the catalog maintains a small cache of references 
> to tables that: a) are accessed frequently in the catalog and b) have the 
> highest memory requirements. These entries are not cleared upon server or 
> table invalidation, thus preventing the GC from collecting the memory of 
> these tables. We should make sure that the CatalogUsageMonitor does not 
> maintain entries of tables that have been invalidated or deleted. 
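The retention problem described above can be illustrated with a toy monitor. This is a sketch under stated assumptions, not the CatalogUsageMonitor implementation: the class `UsageMonitor`, its methods, and the `Table` placeholder are all hypothetical. It only shows that a cache holding strong references keeps a table alive until the entry is removed on invalidation.

```python
import gc
import weakref


class Table:
    """Placeholder for a loaded table's metadata (hypothetical)."""


class UsageMonitor:
    """Toy monitor that keeps strong references to the N largest tables."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._largest = {}  # table name -> (estimated bytes, table object)

    def update(self, name, table, estimated_bytes):
        """Track a table, evicting the smallest entry when over capacity."""
        self._largest[name] = (estimated_bytes, table)
        if len(self._largest) > self._capacity:
            smallest = min(self._largest, key=lambda n: self._largest[n][0])
            del self._largest[smallest]

    def remove(self, name):
        """Drop the entry for one table, e.g. when it is invalidated or deleted."""
        self._largest.pop(name, None)

    def clear(self):
        """Drop all entries, e.g. on a global invalidation."""
        self._largest.clear()

    def tracked(self):
        return sorted(self._largest)


if __name__ == "__main__":
    monitor = UsageMonitor(capacity=2)
    table = Table()
    ref = weakref.ref(table)
    monitor.update("db.t", table, estimated_bytes=100)
    del table
    gc.collect()
    print("still referenced:", ref() is not None)
    monitor.remove("db.t")
    gc.collect()
    print("collected after remove:", ref() is None)
```

Calling `remove` (or `clear`) from the invalidation path is one way to satisfy the requirement above. Holding weak references instead would be another option, which this sketch does not explore.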






[jira] [Assigned] (IMPALA-6853) COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-6853:
-

Assignee: (was: Dimitris Tsirogiannis)

> COMPUTE STATS does an unnecessary REFRESH after writing to the Metastore
> 
>
> Key: IMPALA-6853
> URL: https://issues.apache.org/jira/browse/IMPALA-6853
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
>Reporter: Alexander Behm
>Priority: Critical
>  Labels: compute-stats, perfomance
>
> COMPUTE STATS and possibly other DDL operations unnecessarily do the 
> equivalent of a REFRESH after writing to the Hive Metastore. This unnecessary 
> operation can be very expensive, so it should be avoided.
> The behavior can be confirmed from the catalogd logs:
> {code}
> compute stats functional_parquet.alltypes;
> +---+
> | summary   |
> +---+
> | Updated 24 partition(s) and 11 column(s). |
> +---+
> Relevant catalogd.INFO snippet
> I0413 14:40:24.210749 27295 HdfsTable.java:1263] Incrementally loading table 
> metadata for: functional_parquet.alltypes
> I0413 14:40:24.242122 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.244634 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=10: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.247174 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=11: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.249713 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=12: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.252288 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=2: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.254629 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=3: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.256991 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=4: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.259464 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=5: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.262197 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=6: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.264463 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=7: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.266736 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=8: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.269210 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2009/month=9: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.271800 27295 HdfsTable.java:555] Refreshed file metadata for 
> functional_parquet.alltypes Path: 
> hdfs://localhost:20500/test-warehouse/alltypes_parquet/year=2010/month=1: 
> Loaded files: 1 Hidden files: 0 Skipped files: 0 Unknown diskIDs: 0
> I0413 14:40:24.274348 27295 

[jira] [Assigned] (IMPALA-6671) Metadata operations that modify a table blocks topic updates for other unrelated operations

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-6671:
-

Assignee: (was: Dimitris Tsirogiannis)

> Metadata operations that modify a table blocks topic updates for other 
> unrelated operations
> ---
>
> Key: IMPALA-6671
> URL: https://issues.apache.org/jira/browse/IMPALA-6671
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 2.12.0
>Reporter: Mostafa Mokhtar
>Priority: Critical
>  Labels: catalog-server, perfomance
>
> Metadata operations that mutate the state of a table, such as "compute stats 
> foo" or "alter recover partitions", block topic updates for read-only 
> operations against unrelated tables, such as "describe bar".
> Thread for the blocked operation
> {code}
> "Thread-7" prio=10 tid=0x11613000 nid=0x21b3b waiting on condition 
> [0x7f5f2ef52000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x7f6f57ff0240> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
> at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.addTableToCatalogDeltaHelper(CatalogServiceCatalog.java:639)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.addTableToCatalogDelta(CatalogServiceCatalog.java:611)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.addDatabaseToCatalogDelta(CatalogServiceCatalog.java:567)
> at 
> org.apache.impala.catalog.CatalogServiceCatalog.getCatalogDelta(CatalogServiceCatalog.java:449)
> at 
> org.apache.impala.service.JniCatalog.getCatalogDelta(JniCatalog.java:126)
> {code}
> Thread for blocking operation 
> {code}
> "Thread-130" prio=10 tid=0x113d5800 nid=0x2499d runnable 
> [0x7f5ef80d]
>java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:152)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> - locked <0x7f5fffcd9f18> (a java.io.BufferedInputStream)
> at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
> at 
> org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
> at 
> org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
> at 
> org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
> at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_partitions_req(ThriftHiveMetastore.java:1639)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.add_partitions_req(ThriftHiveMetastore.java:1626)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.add_partitions(HiveMetaStoreClient.java:609)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> 

[jira] [Assigned] (IMPALA-6843) Responses to prioritizedLoad() requests should be returned directly and not via the statestore

2018-05-30 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-6843:
-

Assignee: (was: Dimitris Tsirogiannis)

> Responses to prioritizedLoad() requests should be returned directly and not 
> via the statestore
> --
>
> Key: IMPALA-6843
> URL: https://issues.apache.org/jira/browse/IMPALA-6843
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.11.0
>Reporter: Dimitris Tsirogiannis
>Priority: Major
>  Labels: catalog, frontend, latency, perfomance
>
> Currently, when a statement (e.g. SELECT) needs to access some unloaded 
> tables, it issues a prioritizedLoad() request to the catalog. The catalog 
> loads the table metadata but does not respond directly to the coordinator 
> that issued the request. Instead, the metadata for the newly loaded tables 
> is broadcast via the statestore. The problem with this approach is that the 
> latency of the response may vary significantly and may depend on the 
> latencies of other unrelated metadata operations (e.g. REFRESH) that happen 
> to be in the same topic update.
> The response to a prioritizedLoad() request should come directly to the 
> issuing coordinator. Other coordinators will receive the metadata of the 
> newly loaded table via the statestore. 






[jira] [Assigned] (IMPALA-6900) Invalidate metadata operation is ignored at a coordinator if catalog is empty

2018-05-29 Thread Dimitris Tsirogiannis (JIRA)


 [ 
https://issues.apache.org/jira/browse/IMPALA-6900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis reassigned IMPALA-6900:
-

Assignee: Vuk Ercegovac  (was: Dimitris Tsirogiannis)

> Invalidate metadata operation is ignored at a coordinator if catalog is empty
> -
>
> Key: IMPALA-6900
> URL: https://issues.apache.org/jira/browse/IMPALA-6900
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Vuk Ercegovac
>Priority: Major
>
> The following workflow may cause an impalad that issued an INVALIDATE 
> METADATA to falsely conclude that the operation has taken effect, thus 
> causing subsequent queries to fail due to unresolved references to tables or 
> databases. 
> Steps to reproduce:
>  # Start an Impala cluster connecting to an empty HMS (no databases).
>  # Create a database "db" in HMS outside of Impala (e.g. using Hive).
>  # Run INVALIDATE METADATA through Impala.
>  # Run "use db" statement in Impala.
>  
> The while condition in the code snippet below causes the 
> WaitForMinCatalogUpdate function to return prematurely even though INVALIDATE 
> METADATA has not taken effect: 
> {code:java}
> void ImpalaServer::WaitForMinCatalogUpdate(..) {
>   ...
>   VLOG_QUERY << "Waiting for minimum catalog object version: "
>              << min_req_catalog_object_version << " current version: "
>              << min_catalog_object_version;
>   while (catalog_update_info_.min_catalog_object_version <
>              min_req_catalog_object_version
>          && catalog_update_info_.catalog_service_id == catalog_service_id) {
>     catalog_version_update_cv_.Wait(unique_lock);
>   }
> {code}
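To make the exit cases of that loop explicit, here is a plain transliteration of the while condition into a Python predicate. It is only a sketch: the function name and parameter names are made up, it does not reproduce the bug, and the issue text does not say which of the two values is responsible when the catalog is empty.

```python
def must_keep_waiting(current_min_version, required_min_version,
                      current_service_id, expected_service_id):
    """Mirror of the while condition quoted above.

    The wait continues only while BOTH hold: the observed minimum catalog
    object version is below the required one, and the catalog service ID
    is unchanged. If either is false, the function returns immediately.
    """
    return (current_min_version < required_min_version
            and current_service_id == expected_service_id)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(must_keep_waiting(5, 10, "svc-a", "svc-a"))   # keeps waiting
    print(must_keep_waiting(10, 10, "svc-a", "svc-a"))  # version reached
    print(must_keep_waiting(5, 10, "svc-b", "svc-a"))   # service ID changed
```

Plugging in the values logged by the VLOG_QUERY line shows which of the two conjuncts ended the wait in a given run.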






[jira] [Created] (IMPALA-7073) Failed test: query_test.test_scanners.TestScannerReservation.test_scanners

2018-05-24 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7073:
-

 Summary: Failed test: 
query_test.test_scanners.TestScannerReservation.test_scanners
 Key: IMPALA-7073
 URL: https://issues.apache.org/jira/browse/IMPALA-7073
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis


Possibly flaky test: 
{code:java}
Stacktrace
query_test/test_scanners.py:1064: in test_scanners
self.run_test_case('QueryTest/scanner-reservation', vector)
common/impala_test_suite.py:451: in run_test_case
verify_runtime_profile(test_section['RUNTIME_PROFILE'], 
result.runtime_profile)
common/test_result_verifier.py:590: in verify_runtime_profile
actual))
E   AssertionError: Did not find matches for lines in runtime profile:
E   EXPECTED LINES:
E   row_regex:.*InitialRangeActualReservation.*Avg: 4.00 MB.*{code}







[jira] [Created] (IMPALA-7070) Failed test: query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays on S3

2018-05-24 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7070:
-

 Summary: Failed test: 
query_test.test_nested_types.TestParquetArrayEncodings.test_thrift_array_of_arrays
 on S3
 Key: IMPALA-7070
 URL: https://issues.apache.org/jira/browse/IMPALA-7070
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis


 
{code:java}
Error Message

query_test/test_nested_types.py:406: in test_thrift_array_of_arrays "col1 
array") query_test/test_nested_types.py:579: in _create_test_table  
   check_call(["hadoop", "fs", "-put", local_path, location], shell=False) 
/usr/lib64/python2.6/subprocess.py:505: in check_call raise 
CalledProcessError(retcode, cmd) E   CalledProcessError: Command '['hadoop', 
'fs', '-put', 
'/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
 
's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
 returned non-zero exit status 1

Stacktrace

query_test/test_nested_types.py:406: in test_thrift_array_of_arrays
"col1 array")
query_test/test_nested_types.py:579: in _create_test_table
check_call(["hadoop", "fs", "-put", local_path, location], shell=False)
/usr/lib64/python2.6/subprocess.py:505: in check_call
raise CalledProcessError(retcode, cmd)
E   CalledProcessError: Command '['hadoop', 'fs', '-put', 
'/data/jenkins/workspace/impala-asf-2.x-core-s3/repos/Impala/testdata/parquet_nested_types_encodings/bad-thrift.parquet',
 
's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays']'
 returned non-zero exit status 1

Standard Error

SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_thrift_array_of_arrays_11da5fde` CASCADE;

SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_thrift_array_of_arrays_11da5fde`;

MainThread: Created database "test_thrift_array_of_arrays_11da5fde" for test ID 
"query_test/test_nested_types.py::TestParquetArrayEncodings::()::test_thrift_array_of_arrays[exec_option:
 {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: parquet/none]"
-- executing against localhost:21000
create table test_thrift_array_of_arrays_11da5fde.ThriftArrayOfArrays (col1 
array) stored as parquet location 
's3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays';

18/05/20 18:31:03 WARN impl.MetricsConfig: Cannot locate configuration: tried 
hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
18/05/20 18:31:03 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 
second(s).
18/05/20 18:31:03 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
started
18/05/20 18:31:06 INFO Configuration.deprecation: 
fs.s3a.server-side-encryption-key is deprecated. Instead, use 
fs.s3a.server-side-encryption.key
put: rename 
`s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet._COPYING_'
 to 
`s3a://impala-cdh5-s3-test/test-warehouse/test_thrift_array_of_arrays_11da5fde.db/ThriftArrayOfArrays/bad-thrift.parquet':
 Input/output error
18/05/20 18:31:08 INFO impl.MetricsSystemImpl: Stopping s3a-file-system metrics 
system...
18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
stopped.
18/05/20 18:31:08 INFO impl.MetricsSystemImpl: s3a-file-system metrics system 
shutdown complete.{code}






[jira] [Created] (IMPALA-7068) Failed test: metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression on S3

2018-05-24 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7068:
-

 Summary: Failed test: 
metadata.test_partition_metadata.TestPartitionMetadataUncompressedTextOnly.test_unsupported_text_compression
 on S3
 Key: IMPALA-7068
 URL: https://issues.apache.org/jira/browse/IMPALA-7068
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog, Infrastructure
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis


The output below is from running the failed test. It seems that the S3 prefix 
(s3a://impala-cdh5-s3-test) is added twice to the table location, resulting in 
an invalid S3 path. 
{code:java}
Error Message
metadata/test_partition_metadata.py:177: in test_unsupported_text_compression   
  FQ_TBL_NAME, TBL_LOCATION)) common/impala_connection.py:160: in execute 
return self.__beeswax_client.execute(sql_stmt, user=user) 
beeswax/impala_beeswax.py:173: in execute handle = 
self.__execute_query(query_string.strip(), user=user) 
beeswax/impala_beeswax.py:339: in __execute_query handle = 
self.execute_query_async(query_string, user=user) 
beeswax/impala_beeswax.py:335: in execute_query_async return 
self.__do_rpc(lambda: self.imp_service.query(query,)) 
beeswax/impala_beeswax.py:460: in __do_rpc raise 
ImpalaBeeswaxException(self.__build_error_message(b), b) E   
ImpalaBeeswaxException: ImpalaBeeswaxException: E   INNER EXCEPTION:  E   MESSAGE: AnalysisException: Bucket 
impala-cdh5-s3-tests3a does not exist E   CAUSED BY: FileNotFoundException: 
Bucket impala-cdh5-s3-tests3a does not exist
Stacktrace
metadata/test_partition_metadata.py:177: in test_unsupported_text_compression
FQ_TBL_NAME, TBL_LOCATION))
common/impala_connection.py:160: in execute
return self.__beeswax_client.execute(sql_stmt, user=user)
beeswax/impala_beeswax.py:173: in execute
handle = self.__execute_query(query_string.strip(), user=user)
beeswax/impala_beeswax.py:339: in __execute_query
handle = self.execute_query_async(query_string, user=user)
beeswax/impala_beeswax.py:335: in execute_query_async
return self.__do_rpc(lambda: self.imp_service.query(query,))
beeswax/impala_beeswax.py:460: in __do_rpc
raise ImpalaBeeswaxException(self.__build_error_message(b), b)
E   ImpalaBeeswaxException: ImpalaBeeswaxException:
E   INNER EXCEPTION: 
E   MESSAGE: AnalysisException: Bucket impala-cdh5-s3-tests3a does not exist
E   CAUSED BY: FileNotFoundException: Bucket impala-cdh5-s3-tests3a does not 
exist
Standard Error
-- connecting to: localhost:21000
SET sync_ddl=False;
-- executing against localhost:21000
DROP DATABASE IF EXISTS `test_unsupported_text_compression_695d360a` CASCADE;

SET sync_ddl=False;
-- executing against localhost:21000
CREATE DATABASE `test_unsupported_text_compression_695d360a`;

MainThread: Created database "test_unsupported_text_compression_695d360a" for 
test ID 
"metadata/test_partition_metadata.py::TestPartitionMetadataUncompressedTextOnly::()::test_unsupported_text_compression[exec_option:
 {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: text/none]"
MainThread: Starting new HTTPS connection (1): 
impala-cdh5-s3-test.s3.amazonaws.com
-- executing against localhost:21000
create external table 
test_unsupported_text_compression_695d360a.multi_text_compression like 
functional.alltypes location 
's3a://impala-cdh5-s3-tests3a://impala-cdh5-s3-test/test-warehouse/test_unsupported_text_compression_695d360a.db/multi_text_compression';
{code}
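
One way to guard against this kind of doubled prefix is to qualify a path only when it does not already carry a filesystem scheme. The following is a minimal Python sketch, not the actual test-framework code; `qualify_location` and `FILESYSTEM_PREFIX` are hypothetical names used only for illustration, and the prefix value is taken from the bucket name in the log above.

```python
from urllib.parse import urlparse

# Hypothetical constant; the value mirrors the bucket seen in the log output.
FILESYSTEM_PREFIX = "s3a://impala-cdh5-s3-test"


def qualify_location(path, prefix=FILESYSTEM_PREFIX):
    """Return path with prefix prepended, unless path already has a scheme."""
    if urlparse(path).scheme:
        # Already fully qualified (for example s3a://...), so leave it as is.
        return path
    return prefix.rstrip("/") + "/" + path.lstrip("/")


relative = "test-warehouse/db.db/multi_text_compression"
once = qualify_location(relative)
twice = qualify_location(once)  # applying it again must not add the prefix
print(once)
print(twice == once)
```

Making the helper idempotent means it does not matter whether the caller passes a relative or an already qualified location.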






[jira] [Updated] (IMPALA-7058) Crash in exhaustive tests for rhel7

2018-05-22 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis updated IMPALA-7058:
--
Labels: broken-build crash  (was: )

> Crash in exhaustive tests for rhel7 
> 
>
> Key: IMPALA-7058
> URL: https://issues.apache.org/jira/browse/IMPALA-7058
> Project: IMPALA
>  Issue Type: Bug
>  Components: Backend
>Affects Versions: Impala 3.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Pranay Singh
>Priority: Blocker
>  Labels: broken-build, crash
>
> The backtrace is here:
> {code:java}
> #7 0x02d89a84 in 
> impala::DelimitedTextParser::ParseFieldLocations (this=0xcf539a0, 
> max_tuples=1, remaining_len=-102, byte_buffer_ptr=0x7fc6b764dad0, 
> row_end_locations=0x7fc6b764dac0, field_locations=0x10034000, 
> num_tuples=0x7fc6b764dacc, num_fields=0x7fc6b764dac8, 
> next_column_start=0x7fc6b764dad8) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/delimited-text-parser.cc:205
> #8 0x01fdb641 in impala::HdfsSequenceScanner::ProcessRange 
> (this=0x15515f80, row_batch=0xcf54800) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-sequence-scanner.cc:352
> #9 0x02d7a20e in impala::BaseSequenceScanner::GetNextInternal 
> (this=0x15515f80, row_batch=0xcf54800) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/base-sequence-scanner.cc:181
> #10 0x01fb1ff0 in impala::HdfsScanner::ProcessSplit (this=0x15515f80) 
> at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scanner.cc:134
> #11 0x01f89258 in impala::HdfsScanNode::ProcessSplit 
> (this=0x2a4a8800, filter_ctxs=..., expr_results_pool=0x7fc6b764e4b0, 
> scan_range=0x13f5f8700, scanner_thread_reservation=0x7fc6b764e428) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:453
> #12 0x01f885f9 in impala::HdfsScanNode::ScannerThread 
> (this=0x2a4a8800, first_thread=false, scanner_thread_reservation=32768) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:360
> #13 0x01f87a6c in impala::HdfsScanNode::::operator()(void) 
> const (__closure=0x7fc6b764ebe8) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:292
> #14 0x01f89ac8 in 
> boost::detail::function::void_function_obj_invoker0,
>  void>::invoke(boost::detail::function::function_buffer &) 
> (function_obj_ptr=...) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
> #15 0x01bf0b28 in boost::function0::operator() 
> (this=0x7fc6b764ebe0) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
> #16 0x01edc57f in impala::Thread::SuperviseThread(std::string const&, 
> std::string const&, boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*) (name=..., category=..., functor=..., 
> parent_thread_info=0x7fc6b9e53890, thread_started=0x7fc6b9e527c0) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:356
> #17 0x01ee471b in boost::_bi::list5 boost::_bi::value, boost::_bi::value, 
> boost::_bi::value, 
> boost::_bi::value >::operator() const&, std::string const&, boost::function, impala::ThreadDebugInfo 
> const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, 
> void (*&)(std::string const&, std::string const&, boost::function, 
> impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, 
> int) (this=0x2a370fc0, f=@0x2a370fb8: 0x1edc218 
>  boost::function, impala::ThreadDebugInfo const*, 
> impala::Promise*)>, a=...) at 
> /data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
> #18 0x01ee463f in boost::_bi::bind_t const&, std::string const&, boost::function, impala::ThreadDebugInfo 
> const*, impala::Promise*), 
> boost::_bi::list5 boost::_bi::value, boost::_bi::value, 
> boost::_bi::value, 
> boost::_bi::value > >::operator()() (this=0x2a370fb8) 
> at 
> 

[jira] [Created] (IMPALA-7058) Crash in exhaustive tests for rhel7

2018-05-22 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7058:
-

 Summary: Crash in exhaustive tests for rhel7 
 Key: IMPALA-7058
 URL: https://issues.apache.org/jira/browse/IMPALA-7058
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Affects Versions: Impala 3.0
Reporter: Dimitris Tsirogiannis
Assignee: Pranay Singh


The backtrace is here:
{code:java}
#7 0x02d89a84 in 
impala::DelimitedTextParser::ParseFieldLocations (this=0xcf539a0, 
max_tuples=1, remaining_len=-102, byte_buffer_ptr=0x7fc6b764dad0, 
row_end_locations=0x7fc6b764dac0, field_locations=0x10034000, 
num_tuples=0x7fc6b764dacc, num_fields=0x7fc6b764dac8, 
next_column_start=0x7fc6b764dad8) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/delimited-text-parser.cc:205
#8 0x01fdb641 in impala::HdfsSequenceScanner::ProcessRange 
(this=0x15515f80, row_batch=0xcf54800) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-sequence-scanner.cc:352
#9 0x02d7a20e in impala::BaseSequenceScanner::GetNextInternal 
(this=0x15515f80, row_batch=0xcf54800) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/base-sequence-scanner.cc:181
#10 0x01fb1ff0 in impala::HdfsScanner::ProcessSplit (this=0x15515f80) 
at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scanner.cc:134
#11 0x01f89258 in impala::HdfsScanNode::ProcessSplit (this=0x2a4a8800, 
filter_ctxs=..., expr_results_pool=0x7fc6b764e4b0, scan_range=0x13f5f8700, 
scanner_thread_reservation=0x7fc6b764e428) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:453
#12 0x01f885f9 in impala::HdfsScanNode::ScannerThread (this=0x2a4a8800, 
first_thread=false, scanner_thread_reservation=32768) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:360
#13 0x01f87a6c in impala::HdfsScanNode::::operator()(void) 
const (__closure=0x7fc6b764ebe8) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/exec/hdfs-scan-node.cc:292
#14 0x01f89ac8 in 
boost::detail::function::void_function_obj_invoker0,
 void>::invoke(boost::detail::function::function_buffer &) 
(function_obj_ptr=...) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
#15 0x01bf0b28 in boost::function0::operator() 
(this=0x7fc6b764ebe0) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
#16 0x01edc57f in impala::Thread::SuperviseThread(std::string const&, 
std::string const&, boost::function, impala::ThreadDebugInfo const*, 
impala::Promise*) (name=..., category=..., functor=..., 
parent_thread_info=0x7fc6b9e53890, thread_started=0x7fc6b9e527c0) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/repos/Impala/be/src/util/thread.cc:356
#17 0x01ee471b in boost::_bi::list5, 
boost::_bi::value, 
boost::_bi::value >::operator(), impala::ThreadDebugInfo 
const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, 
void (*&)(std::string const&, std::string const&, boost::function, 
impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, 
int) (this=0x2a370fc0, f=@0x2a370fb8: 0x1edc218 
, a=...) at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
#18 0x01ee463f in boost::_bi::bind_t, 
boost::_bi::value, 
boost::_bi::value > >::operator()() (this=0x2a370fb8) 
at 
/data/jenkins/workspace/impala-asf-master-exhaustive-rhel7/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#19 0x01ee4602 in boost::detail::thread_data, 
boost::_bi::value, 
boost::_bi::value > > 

[jira] [Created] (IMPALA-7048) Failed test: query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables

2018-05-18 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-7048:
-

 Summary: Failed test: 
query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables
 Key: IMPALA-7048
 URL: https://issues.apache.org/jira/browse/IMPALA-7048
 Project: IMPALA
  Issue Type: Bug
  Components: Backend
Reporter: Dimitris Tsirogiannis
Assignee: Zoltán Borók-Nagy


The following test fails when the filesystem is LOCAL:
{code:java}
query_test.test_parquet_page_index.TestHdfsParquetTableIndexWriter.test_write_index_many_columns_tables[exec_option:
 {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from 
pytest) {code}
Zoltan, assigning to you since this looks suspiciously related to the fix for 
IMPALA-5842. 






[jira] [Resolved] (IMPALA-6948) Coordinators don't detect the deletion of tables that occurred outside of impala after catalog restart

2018-05-11 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis resolved IMPALA-6948.
---
Resolution: Fixed

> Coordinators don't detect the deletion of tables that occurred outside of 
> impala after catalog restart
> --
>
> Key: IMPALA-6948
> URL: https://issues.apache.org/jira/browse/IMPALA-6948
> Project: IMPALA
>  Issue Type: Bug
>  Components: Catalog
>Affects Versions: Impala 3.0, Impala 2.12.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Dimitris Tsirogiannis
>Priority: Blocker
>  Labels: catalog-server
>
> Upon catalog restart the coordinators detect this event and request a full 
> topic update from the statestore. In certain cases, the topic update protocol 
> executed between the statestore and the catalog fails to detect catalog 
> objects that were deleted from the Metastore externally (e.g. via HIVE), thus 
> causing these objects to show up again in each coordinator's catalog cache. 
> The end result is that the catalog server and the coordinator's cache are out 
> of sync and in some cases the only solution is to restart both the catalog 
> and the statestore. 
> The following sequence can reproduce this issue:
> {code:java}
> impala> create table lala(a int);
> bash> kill -9 `pidof catalogd`
> hive> drop table lala;
> bash> restart catalogd 
> impala> show tables;
> --- lala shows up in the list of tables;{code}





[jira] [Created] (IMPALA-6948) Coordinators don't detect the deletion of tables that occurred outside of impala after catalog restart

2018-04-27 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-6948:
-

 Summary: Coordinators don't detect the deletion of tables that 
occurred outside of impala after catalog restart
 Key: IMPALA-6948
 URL: https://issues.apache.org/jira/browse/IMPALA-6948
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: Dimitris Tsirogiannis
Assignee: Dimitris Tsirogiannis


When the catalog restarts, the coordinators detect the restart and request a 
full topic update from the statestore. In certain cases, the topic update 
protocol executed between the statestore and the catalog fails to detect 
catalog objects that were deleted from the Metastore externally (e.g. via 
Hive), thus causing these objects to show up again in each coordinator's 
catalog cache. As a result, the catalog server and the coordinators' caches 
are out of sync, and in some cases the only solution is to restart both the 
catalog and the statestore. 

The following sequence can reproduce this issue:
{code:java}
impala> create table lala(a int);
bash> kill -9 `pidof catalogd`
hive> drop table lala;
bash> restart catalogd 
impala> show tables;
--- lala shows up in the list of tables;{code}
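
The behaviour one would expect after a full topic update can be sketched as follows: any object in the coordinator's cache that is absent from the full update is dropped. This is a simplified Python illustration, not Impala's implementation; `apply_full_update` and the dict-based cache are assumptions made for the example, and the version numbers are made up.

```python
def apply_full_update(cache, full_update):
    """Replace cache contents with full_update and return the dropped names.

    Both arguments map object names to catalog versions. A full update
    describes the complete catalog, so any cached name that is missing
    from it is treated as deleted.
    """
    dropped = sorted(set(cache) - set(full_update))
    cache.clear()
    cache.update(full_update)
    return dropped


# The coordinator still caches "lala", which was dropped through Hive while
# catalogd was down.
coordinator_cache = {"functional.alltypes": 10, "default.lala": 12}
update_after_restart = {"functional.alltypes": 3}

removed = apply_full_update(coordinator_cache, update_after_restart)
print(removed)
print(coordinator_cache)
```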






[jira] [Created] (IMPALA-6901) Investigate memory consumption of table metrics in the catalog

2018-04-20 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-6901:
-

 Summary: Investigate memory consumption of table metrics in the 
catalog
 Key: IMPALA-6901
 URL: https://issues.apache.org/jira/browse/IMPALA-6901
 Project: IMPALA
  Issue Type: Improvement
  Components: Catalog
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: Dimitris Tsirogiannis


IMPALA-4886 introduced the concept of per-table metrics. In some cases (catalog 
with 90K tables) it has been reported that the table metrics can consume almost 
30% of catalog heap size. We should perform the following actions:
 # Measure the impact of table metrics on memory usage.
 # Tune/optimize table metrics to reduce their memory requirements. Some quick 
fixes/ideas may include: a) replace/tune histogram based metrics with simpler 
ones, b) eliminate metrics that are not considered particularly useful, c) 
store detailed metrics only for "interesting" tables (e.g. large tables that 
are heavily used).  
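
Idea (c) above could look roughly like the following Python sketch, which keeps detailed metrics only for the highest-ranked tables. It is an illustration under stated assumptions, not catalog code: `tables_with_detailed_metrics`, the `(access_count, estimated_bytes)` tuples, and the sample table names are all hypothetical.

```python
import heapq


def tables_with_detailed_metrics(usage, limit):
    """Pick the `limit` tables with the highest (access_count, bytes) score.

    usage maps a table name to a tuple (access_count, estimated_bytes).
    Tuples compare element by element, so access count is the primary key
    and size breaks ties.
    """
    ranked = heapq.nlargest(limit, usage.items(), key=lambda item: item[1])
    return {name for name, _ in ranked}


usage = {
    "sales.orders": (950, 8_000_000),
    "tmp.scratch": (2, 1_000),
    "sales.items": (400, 3_000_000),
}
detailed = tables_with_detailed_metrics(usage, limit=2)
print(sorted(detailed))
```

All other tables would then keep only cheap counters, which bounds the memory spent on histograms regardless of how many tables the catalog holds.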





[jira] [Created] (IMPALA-6900) Invalidate metadata operation is ignored at a coordinator if catalog is empty

2018-04-20 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-6900:
-

 Summary: Invalidate metadata operation is ignored at a coordinator 
if catalog is empty
 Key: IMPALA-6900
 URL: https://issues.apache.org/jira/browse/IMPALA-6900
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: Dimitris Tsirogiannis
Assignee: Dimitris Tsirogiannis


The following workflow may cause an impalad that issued an INVALIDATE METADATA 
to falsely conclude that the operation has taken effect, thus causing 
subsequent queries to fail due to unresolved references to tables or 
databases. 

Steps to reproduce:
 # Start an impala cluster connecting to an empty HMS (no databases).
 # Create a database "db" in HMS outside of Impala (e.g. using Hive).
 # Run INVALIDATE METADATA through Impala.
 # Run "use db" statement in Impala.

 

The while condition in the code snippet below causes the 
WaitForMinCatalogUpdate function to return prematurely even though INVALIDATE 
METADATA has not taken effect: 
{code:java}
void ImpalaServer::WaitForMinCatalogUpdate(..) {
...
VLOG_QUERY << "Waiting for minimum catalog object version: "
   << min_req_catalog_object_version << " current version: "
   << min_catalog_object_version;
while (catalog_update_info_.min_catalog_object_version <  
min_req_catalog_object_version && catalog_update_info_.catalog_service_id ==  
catalog_service_id) {
   catalog_version_update_cv_.Wait(unique_lock);
}
{code}
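
To make the exit conditions of that loop easier to see, the predicate can be restated as a small Python function. This is only a restatement of the while-condition quoted above, with hypothetical sample values; it does not claim which clause is responsible in the empty-catalog case, since the report does not say.

```python
def keeps_waiting(current_min_version, required_min_version,
                  current_service_id, expected_service_id):
    """Mirror of the C++ while-condition: wait only while BOTH clauses hold."""
    return (current_min_version < required_min_version
            and current_service_id == expected_service_id)


# Version still behind and same catalog service: the caller blocks.
print(keeps_waiting(5, 8, "svc-a", "svc-a"))
# Version clause already false: the caller returns without waiting.
print(keeps_waiting(8, 8, "svc-a", "svc-a"))
# Service id differs: the caller also returns, regardless of versions.
print(keeps_waiting(5, 8, "svc-b", "svc-a"))
```

Because the clauses are joined with a logical AND, either one becoming false is enough for the function to return.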





[jira] [Created] (IMPALA-6876) Entries in CatalogUsageMonitor are not cleared after invalidation

2018-04-18 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-6876:
-

 Summary: Entries in CatalogUsageMonitor are not cleared after 
invalidation
 Key: IMPALA-6876
 URL: https://issues.apache.org/jira/browse/IMPALA-6876
 Project: IMPALA
  Issue Type: Bug
  Components: Catalog
Affects Versions: Impala 3.0, Impala 2.12.0
Reporter: Dimitris Tsirogiannis
Assignee: Dimitris Tsirogiannis


The CatalogUsageMonitor in the catalog maintains a small cache of references to 
tables that: a) are accessed frequently in the catalog and b) have the highest 
memory requirements. These entries are not cleared upon server or table 
invalidation, thus preventing the GC from collecting the memory of these 
tables. We should make sure that the CatalogUsageMonitor does not maintain 
entries of tables that have been invalidated or deleted. 
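
The effect of such lingering references can be sketched in Python, where a weak reference serves as a probe for whether an object is still reachable. This is an analogy under stated assumptions, not the Java CatalogUsageMonitor: the `UsageMonitor` and `Table` classes and the `remove` hook are hypothetical, and CPython's collector differs from the JVM's.

```python
import gc
import weakref


class Table:
    def __init__(self, name):
        self.name = name


class UsageMonitor:
    """Hypothetical stand-in for a monitor that tracks a few tables."""

    def __init__(self):
        self._tracked = {}

    def track(self, table):
        self._tracked[table.name] = table  # strong reference

    def remove(self, name):
        # Called on invalidation or deletion so the table can be collected.
        self._tracked.pop(name, None)

    def tracked_names(self):
        return sorted(self._tracked)


monitor = UsageMonitor()
table = Table("s3.catalog_sales")
probe = weakref.ref(table)
monitor.track(table)

del table
gc.collect()
still_alive = probe() is not None  # the monitor alone keeps the table alive

monitor.remove("s3.catalog_sales")
gc.collect()
collected = probe() is None  # nothing references the table any more
print(still_alive, collected)
```

An alternative design would store only weak references in the monitor, at the cost of having to handle entries that disappear between lookups.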





[jira] [Resolved] (IMPALA-6424) REFRESH right after invalidate metadata loads file metadata twice

2018-02-21 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis resolved IMPALA-6424.
---
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

Change-Id: Ie41a734493dcea0e36d6b051966f1d0302907dee
Reviewed-on: http://gerrit.cloudera.org:8080/9224
Reviewed-by: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
1 file changed, 23 insertions(+), 5 deletions(-)

> REFRESH right after invalidate metadata  loads file metadata twice
> -
>
> Key: IMPALA-6424
> URL: https://issues.apache.org/jira/browse/IMPALA-6424
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Reporter: Juan Yu
>Assignee: Dimitris Tsirogiannis
>Priority: Critical
> Fix For: Impala 2.12.0
>
>
> Compared with a normal REFRESH, a REFRESH right after INVALIDATE METADATA  
> loads file metadata twice and takes 2x the time. The second refresh seems 
> redundant.
> I0119 07:46:41.107390 26758 CatalogServiceCatalog.java:1518] Invalidating 
> table metadata: s3.catalog_sales
> I0119 07:46:43.002053 26309 catalog-server.cc:331] Publishing update : 
> TABLE:s3.catalog_sales@1166
> I0119 07:46:43.002068 26309 catalog-server.cc:331] Publishing update : 
> CATALOG:b0f520a5e2ab4056:b7e2e045fa39d625@1166
> I0119 07:46:46.696725 26758 TableLoadingMgr.java:70] Loading metadata for 
> table: s3.catalog_sales
> I0119 07:46:46.696781 26758 TableLoadingMgr.java:72] Remaining items in 
> queue: 0. Loads in progress: 1
> I0119 07:46:46.696857 27023 TableLoader.java:58] Loading metadata for: 
> s3.catalog_sales
> I0119 07:46:46.713222 27023 HdfsTable.java:1206] Fetching partition metadata 
> from the Metastore: s3.catalog_sales
> I0119 07:46:46.905102 27023 HdfsTable.java:1210] Fetched partition metadata 
> from the Metastore: s3.catalog_sales
>  *I0119 07:46:46.939254 27023 HdfsTable.java:834] Loading file and block 
> metadata for 1837 paths for table s3.catalog_sales using a thread pool of 
> size 20*
> I0119 07:47:00.426975 27023 HdfsTable.java:874] Loaded file and block 
> metadata for s3.catalog_sales
> I0119 07:47:00.427062 27023 TableLoader.java:97] Loaded metadata for: 
> s3.catalog_sales
> I0119 07:47:00.427243 26758 CatalogServiceCatalog.java:1433] Refreshing table 
> metadata: s3.catalog_sales
> I0119 07:47:00.441572 26758 HdfsTable.java:1193] Incrementally loading table 
> metadata for: s3.catalog_sales
>  *I0119 07:47:00.456437 26758 HdfsTable.java:834] Loading file and block 
> metadata for 1837 paths for table s3.catalog_sales using a thread pool of 
> size 20*
> I0119 07:47:14.038097 26758 HdfsTable.java:874] Loaded file and block 
> metadata for s3.catalog_sales
> I0119 07:47:14.038132 26758 HdfsTable.java:1203] Incrementally loaded table 
> metadata for: s3.catalog_sales
> I0119 07:47:14.038179 26758 CatalogServiceCatalog.java:1456] Refreshed table 
> metadata: s3.catalog_sales
> I0119 07:47:14.062625 26309 catalog-server.cc:331] Publishing update : 
> TABLE:s3.catalog_sales@1168
> I0119 07:47:14.062645 26309 catalog-server.cc:331] Publishing update : 
> CATALOG:b0f520a5e2ab4056:b7e2e045fa39d625@1168
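
The cost of the duplicate load can be read off the two highlighted pairs of log lines. The following Python sketch does that arithmetic; the log lines are copied from the description above (re-joined where the email wrapped them), while the helper names are made up for this example.

```python
from datetime import datetime

# Start/end lines of the two file-metadata loads, taken from the log above.
LOG_LINES = [
    "I0119 07:46:46.939254 27023 HdfsTable.java:834] Loading file and block "
    "metadata for 1837 paths for table s3.catalog_sales using a thread pool "
    "of size 20",
    "I0119 07:47:00.426975 27023 HdfsTable.java:874] Loaded file and block "
    "metadata for s3.catalog_sales",
    "I0119 07:47:00.456437 26758 HdfsTable.java:834] Loading file and block "
    "metadata for 1837 paths for table s3.catalog_sales using a thread pool "
    "of size 20",
    "I0119 07:47:14.038097 26758 HdfsTable.java:874] Loaded file and block "
    "metadata for s3.catalog_sales",
]


def timestamp(line):
    """Parse the time of day, the second whitespace-separated field."""
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


def load_durations(lines):
    """Return the duration in seconds of each Loading/Loaded pair."""
    durations = []
    start = None
    for line in lines:
        if "] Loading file and block metadata" in line:
            start = timestamp(line)
        elif "] Loaded file and block metadata" in line and start is not None:
            durations.append((timestamp(line) - start).total_seconds())
            start = None
    return durations


durations = load_durations(LOG_LINES)
```

Both loads take roughly 13.5 seconds each, which is consistent with the report that the combined operation takes about twice as long as a single load.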





[jira] [Created] (IMPALA-6447) Failure in tests.stress.concurrent_select

2018-01-29 Thread Dimitris Tsirogiannis (JIRA)
Dimitris Tsirogiannis created IMPALA-6447:
-

 Summary: Failure in tests.stress.concurrent_select
 Key: IMPALA-6447
 URL: https://issues.apache.org/jira/browse/IMPALA-6447
 Project: IMPALA
  Issue Type: Bug
  Components: Infrastructure
Reporter: Dimitris Tsirogiannis
Assignee: Michael Brown


{code:java}

../infra/python/env/lib/python2.6/site-packages/_pytest/python.py:611: in _importtestmodule
    mod = self.fspath.pyimport(ensuresyspath=importmode)
../infra/python/env/lib/python2.6/site-packages/py/_path/local.py:662: in pyimport
    __import__(modname)
../infra/python/env/lib/python2.6/site-packages/_pytest/assertion/rewrite.py:171: in load_module
    py.builtin.exec_(co, mod.__dict__)
metadata/test_explain.py:26: in <module>
    from tests.stress.concurrent_select import match_memory_estimate
E   File "/data/jenkins/workspace/impala-asf-master-core/repos/Impala/tests/stress/concurrent_select.py", line 1644
E   tables = {t: cursor.describe_table(t) for t in cursor.list_table_names()}
E   ^
E   SyntaxError: invalid syntax
{code}
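
The traceback shows a Python 2.6 interpreter rejecting a dict comprehension, a syntax that only exists in Python 2.7 and later. One possible 2.6-compatible spelling is sketched below; it is not necessarily the fix that was applied. StubCursor is a hypothetical stand-in for the cursor object used by the stress test, and only the two method names come from the traceback.

```python
class StubCursor(object):
    """Hypothetical stand-in for the cursor used in concurrent_select.py."""

    def list_table_names(self):
        return ["t1", "t2"]

    def describe_table(self, name):
        return "description of %s" % name


cursor = StubCursor()

# Python 2.7+ only (the form that fails to parse on Python 2.6):
# tables = {t: cursor.describe_table(t) for t in cursor.list_table_names()}

# Parses on Python 2.6 as well: dict() over a generator of (key, value) pairs.
tables = dict((t, cursor.describe_table(t)) for t in cursor.list_table_names())
```

Both forms build the same mapping; the second one merely avoids syntax that the older interpreter cannot parse.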





[jira] [Resolved] (IMPALA-4886) Expose per table partition/files/blocks count in web UI

2018-01-19 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis resolved IMPALA-4886.
---
   Resolution: Fixed
Fix Version/s: Impala 3.0

Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e
Reviewed-on: http://gerrit.cloudera.org:8080/8529
Reviewed-by: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M common/thrift/CatalogObjects.thrift
M common/thrift/Frontend.thrift
M common/thrift/JniCatalog.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
A fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/common/Metrics.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
A fe/src/main/java/org/apache/impala/util/TopNCache.java
A fe/src/test/java/org/apache/impala/util/TestTopNCache.java
M tests/webserver/test_web_pages.py
M www/catalog.tmpl
A www/table_metrics.tmpl
24 files changed, 1,206 insertions(+), 113 deletions(-)

> Expose per table partition/files/blocks count in web UI
> ---
>
> Key: IMPALA-4886
> URL: https://issues.apache.org/jira/browse/IMPALA-4886
> Project: IMPALA
>  Issue Type: Sub-task
>  Components: Catalog
>Affects Versions: Impala 2.8.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Dimitris Tsirogiannis
>Priority: Major
>  Labels: usability
> Fix For: Impala 3.0
>
>






[jira] [Resolved] (IMPALA-5058) Improve concurrency of DDL/DML operations during catalog updates

2018-01-16 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis resolved IMPALA-5058.
---
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

Change-Id: If12467a83acaeca6a127491d89291dedba91a35a
Reviewed-on: http://gerrit.cloudera.org:8080/7731
Reviewed-by: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Tested-by: Impala Public Jenkins
Reviewed-on: http://gerrit.cloudera.org:8080/8752
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalog-util.cc
M be/src/catalog/catalog-util.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M common/thrift/CatalogInternalService.thrift
M common/thrift/CatalogService.thrift
M common/thrift/Frontend.thrift
M common/thrift/StatestoreService.thrift
M fe/src/main/java/org/apache/impala/catalog/AuthorizationPolicy.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObject.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObjectCache.java
A fe/src/main/java/org/apache/impala/catalog/CatalogObjectImpl.java
A fe/src/main/java/org/apache/impala/catalog/CatalogObjectVersionQueue.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/DataSource.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/HdfsCachePool.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Role.java
M fe/src/main/java/org/apache/impala/catalog/RolePrivilege.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
A fe/src/main/java/org/apache/impala/catalog/TopicUpdateLog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M fe/src/main/java/org/apache/impala/util/SentryProxy.java
M fe/src/test/java/org/apache/impala/testutil/CatalogServiceTestCatalog.java
M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java
M tests/statestore/test_statestore.py
46 files changed, 1,776 insertions(+), 860 deletions(-)

> Improve concurrency of DDL/DML operations during catalog updates
> 
>
> Key: IMPALA-5058
> URL: https://issues.apache.org/jira/browse/IMPALA-5058
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Catalog
>Affects Versions: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0
>Reporter: Dimitris Tsirogiannis
>Assignee: Dimitris Tsirogiannis
>Priority: Critical
>  Labels: catalog-server, performance, usability
> Fix For: Impala 2.12.0
>
> Attachments: sample-refresh-duration-graph.png
>
>
> Currently, long-running DDL/DML operations can block other operations from 
> making progress if they run concurrently with the getCatalogObjects() call 
> that creates catalog updates. The reason is that getCatalogObjects() holds 
> the catalog lock for its entire duration and also tries to acquire the locks 
> of the tables it processes. If it is blocked by another operation on a 
> table, then any other, unrelated catalog write operation cannot make 
> progress, because it cannot acquire the catalog lock held by 
> getCatalogObjects().
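
The blocking pattern described above can be reproduced with two locks. The Python sketch below is only an illustration of the contention, not Impala code: catalog_lock, table_lock, and the function names are hypothetical, and the timeouts are arbitrary values chosen to keep the demo short.

```python
import threading

catalog_lock = threading.Lock()  # hypothetical global catalog lock
table_lock = threading.Lock()    # hypothetical lock of a single table


def collect_updates(started, wait_s):
    """Hold the catalog lock while waiting for a table lock (problem case)."""
    with catalog_lock:
        started.set()
        if table_lock.acquire(timeout=wait_s):
            table_lock.release()


def unrelated_write(timeout_s):
    """An unrelated catalog write that only needs the catalog lock."""
    acquired = catalog_lock.acquire(timeout=timeout_s)
    if acquired:
        catalog_lock.release()
    return acquired


def demo():
    table_lock.acquire()  # a long-running DDL holds the table lock
    started = threading.Event()
    collector = threading.Thread(target=collect_updates, args=(started, 0.5))
    collector.start()
    started.wait()  # the collector now owns the catalog lock

    # The collector is stuck on the table lock, so this write times out.
    blocked_result = unrelated_write(timeout_s=0.1)

    collector.join()
    table_lock.release()  # the long-running DDL finishes

    # With nobody holding the catalog lock, the same write succeeds.
    free_result = unrelated_write(timeout_s=0.1)
    return blocked_result, free_result
```

demo() returns (False, True): the unrelated write fails only while the collector holds the catalog lock and waits for the table lock, which is the coupling the issue proposes to remove.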





[jira] [Resolved] (IMPALA-4168) Adopt Oracle-style hint placement for INSERT statements

2018-01-10 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis resolved IMPALA-4168.
---
   Resolution: Fixed
Fix Version/s: Impala 2.12.0

Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20
Reviewed-on: http://gerrit.cloudera.org:8080/8676
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
8 files changed, 265 insertions(+), 84 deletions(-)

> Adopt Oracle-style hint placement for INSERT statements
> ---
>
> Key: IMPALA-4168
> URL: https://issues.apache.org/jira/browse/IMPALA-4168
> Project: IMPALA
>  Issue Type: Improvement
>  Components: Frontend
>Affects Versions: Impala 2.2, Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, 
> Impala 2.6.0, Impala 2.7.0
>Reporter: Alexander Behm
>Assignee: Jinchul Kim
>  Labels: incompatibility, ramp-up
> Fix For: Impala 2.12.0
>
>
> For consistency with Oracle, we should consider accepting hints in the same 
> places in SQL statements. For example, our current INSERT statement accepts 
> hints right before the SELECT portion:
> {code}
> INSERT INTO t PARTITION(year,month) /*+ hints */ SELECT * FROM src;
> {code}
> The proposal is to accept hints immediately after INSERT, as Oracle does:
> {code}
> INSERT /*+ hints */ INTO t PARTITION(year,month) SELECT * FROM src;
> {code}
> Ideally, we would not accept hints in multiple places, to avoid confusion and 
> to reduce the code and testing burden. Ceasing to recognize the old hint 
> placement is a backwards-incompatible change.
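
Because dropping the old placement is backwards incompatible, existing statements would need to be migrated. The Python sketch below shows one naive way to move a hint from the old position to the new one. It is a hypothetical helper, not part of Impala, and it deliberately ignores cases such as string literals containing comment markers, WITH clauses, or multiple hints.

```python
import re

# INSERT <target> /*+ hint */ SELECT ...   (old placement)
_OLD_STYLE = re.compile(
    r"^(\s*INSERT)\s+(.*?)\s*(/\*\+.*?\*/)\s*(SELECT\b.*)$",
    re.IGNORECASE | re.DOTALL,
)


def move_insert_hint(sql):
    """Rewrite an old-style INSERT hint to follow the INSERT keyword.

    Statements that do not match the old placement are returned unchanged.
    """
    match = _OLD_STYLE.match(sql)
    if match is None:
        return sql
    insert_kw, target, hint, select = match.groups()
    return "%s %s %s %s" % (insert_kw, hint, target, select)


old_style = "INSERT INTO t PARTITION(year,month) /*+ hints */ SELECT * FROM src;"
new_style = move_insert_hint(old_style)
```

A statement that already uses the new placement does not match the pattern and passes through untouched, so the helper can be applied repeatedly without changing the result.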





[jira] [Resolved] (IMPALA-6001) Integration job failed in TestDdlStatements.test_functions_ddl - one extra function in actual output

2017-11-29 Thread Dimitris Tsirogiannis (JIRA)

 [ 
https://issues.apache.org/jira/browse/IMPALA-6001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitris Tsirogiannis resolved IMPALA-6001.
---
Resolution: Fixed

Change-Id: I3a2cddee5d565384e9de0e61b3b7d0d9075e0dce
Reviewed-on: http://gerrit.cloudera.org:8080/8667
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Impala Public Jenkins
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalog-util.cc
M be/src/catalog/catalog-util.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M common/thrift/CatalogInternalService.thrift
M common/thrift/StatestoreService.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogDeltaLog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M tests/statestore/test_statestore.py
21 files changed, 441 insertions(+), 480 deletions(-)

> Integration job failed in TestDdlStatements.test_functions_ddl - one extra 
> function in actual output
> 
>
> Key: IMPALA-6001
> URL: https://issues.apache.org/jira/browse/IMPALA-6001
> Project: IMPALA
>  Issue Type: Bug
>  Components: Frontend
>Affects Versions: Impala 2.10.0
>Reporter: Alexander Behm
>Assignee: Dimitris Tsirogiannis
>Priority: Blocker
>  Labels: broken-build
>
> TestDdlStatements.test_functions_ddl failed.
> What's weird is that the following function was already deleted and seems to 
> have re-appeared:
> {code}
> 'INT','fn()','NATIVE','true'
> {code}
> Relevant Jenkins snippet below:
> {code}
> 15:35:55.450 === FAILURES 
> ===
> 15:35:55.450  TestDdlStatements.test_functions_ddl[exec_option: {'sync_ddl': 
> 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 
> 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none-unique_database0] 
> 15:35:55.450 [gw3] linux2 -- Python 2.6.6 
> /data/jenkins/workspace/impala-asf-master-exhaustive-integration/repos/Impala/bin/../infra/python/env/bin/python
> 15:35:55.450 metadata/test_ddl.py:324: in test_functions_ddl
> 15:35:55.450 multiple_impalad=self._use_multiple_impalad(vector))
> 15:35:55.450 common/impala_test_suite.py:420: in run_test_case
> 15:35:55.450 self.__verify_results_and_errors(vector, test_section, 
> result, use_db)
> 15:35:55.450 common/impala_test_suite.py:293: in __verify_results_and_errors
> 15:35:55.450 replace_filenames_with_placeholder)
> 15:35:55.450 common/test_result_verifier.py:404: in verify_raw_results
> 15:35:55.450 VERIFIER_MAP[verifier](expected, actual)
> 15:35:55.450 common/test_result_verifier.py:231: in 
> verify_query_result_is_equal
> 15:35:55.450 assert expected_results == actual_results
> 15:35:55.450 E   assert Comparing QueryTestResults (expected vs actual):
> 15:35:55.450 E 'DOUBLE','fn(INT)','NATIVE','true' == 
> 'DOUBLE','fn(INT)','NATIVE','true'
> 15:35:55.450 E 'INT','fn(INT, STRING)','NATIVE','true' != 
> 'INT','fn()','NATIVE','true'
> 15:35:55.450 E 'INT','fn(STRING, INT)','NATIVE','true' != 'INT','fn(INT, 
> STRING)','NATIVE','true'
> 15:35:55.450 E 'INT','fn2(INT)','NATIVE','true' != 'INT','fn(STRING, 
> INT)','NATIVE','true'
> 15:35:55.450 E None != 'INT','fn2(INT)','NATIVE','true'
> 15:35:55.450 E Number of rows returned (expected vs actual): 4 != 5
> 15:35:55.450  Captured stderr setup 
> -
> 15:35:55.450 SET sync_ddl=True;
> 15:35:55.450 -- executing against localhost:21000
> 15:35:55.450 DROP DATABASE IF EXISTS `test_functions_ddl_baf0bb91` CASCADE;
> 15:35:55.450 
> 15:35:55.450 SET sync_ddl=True;
> 15:35:55.450 -- executing against localhost:21000
> 15:35:55.450 CREATE DATABASE `test_functions_ddl_baf0bb91`;
> 15:35:55.450 
> 15:35:55.450 MainThread: Created database "test_functions_ddl_baf0bb91" for 
> test ID 
> "metadata/test_ddl.py::TestDdlStatements::()::test_functions_ddl[exec_option: 
> {'sync_ddl': 1, 'batch_size': 0, 'num_nodes': 0, 
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False, 
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format: 
> text/none-unique_database0]"
> 15:35:55.450