[jira] [Created] (IMPALA-6652) KRPC : Data Stream Manager Deferred RPCs in memz page isn't correct
Mostafa Mokhtar created IMPALA-6652:
---
Summary: KRPC : Data Stream Manager Deferred RPCs in memz page isn't correct
Key: IMPALA-6652
URL: https://issues.apache.org/jira/browse/IMPALA-6652
Project: IMPALA
Issue Type: Bug
Components: Distributed Exec
Affects Versions: Impala 2.11.0
Reporter: Mostafa Mokhtar
Assignee: Lars Volker

While loading data into a Kudu table against the latest Impala 2.11.0, I noticed that "Data Stream Manager Deferred RPCs" on the memz page isn't accurate.

From memz on the worker:
{code}
Process: Limit=201.73 GB Total=85.41 GB Peak=85.41 GB
  Buffer Pool: Free Buffers: Total=43.64 MB
  Buffer Pool: Clean Pages: Total=0
  Buffer Pool: Unused Reservation: Total=-17.84 MB
  Data Stream Service Queue: Limit=10.09 GB Total=0 Peak=512.97 MB
  Data Stream Manager Deferred RPCs: Total=0 Peak=0
  TCMalloc Overhead: Total=124.07 MB
  Free Disk IO Buffers: Total=984.97 MB Peak=984.97 MB
  RequestPool=root.default: Total=83.92 GB Peak=83.92 GB
    Query(844a0200d7876345:20bb38b9): Reservation=70.44 GB ReservationLimit=161.39 GB OtherMemory=13.48 GB Total=83.92 GB Peak=83.92 GB
      Fragment 844a0200d7876345:20bb38b900a3: Reservation=70.44 GB OtherMemory=38.08 MB Total=70.47 GB Peak=70.47 GB
        SORT_NODE (id=2): Reservation=70.44 GB OtherMemory=8.00 KB Total=70.44 GB Peak=70.44 GB
        EXCHANGE_NODE (id=1): Reservation=18.06 MB OtherMemory=0 Total=18.06 MB Peak=19.53 MB
          KrpcDeferredRpcs: Total=0 Peak=1.47 MB
        KuduTableSink: Total=20.00 MB Peak=20.00 MB
        CodeGen: Total=438.00 B Peak=306.00 KB
      Fragment 844a0200d7876345:20bb38b90022: Reservation=0 OtherMemory=13.44 GB Total=13.44 GB Peak=13.97 GB
        HDFS_SCAN_NODE (id=0): Total=13.44 GB Peak=13.97 GB
        KrpcDataStreamSender (dst_id=1): Total=2.57 MB Peak=3.61 MB
        CodeGen: Total=234.00 B Peak=52.50 KB
  Untracked Memory: Total=389.18 MB
{code}

And a snapshot from the query profile:
{code}
Instance 844a0200d7876345:20bb38b900a3 (host=va1030.halxg.cloudera.com:22000):(Total: 1s172ms, non-child: 200.411ms, % non-child: 17.09%)
  Fragment Instance Lifecycle Event Timeline: 1s173ms
     - Prepare Finished: 199.691ms (199.691ms)
     - Open Finished: 1s173ms (973.902ms)
  MemoryUsage(1m4s): 4.77 GB, 13.21 GB, 19.60 GB, 23.70 GB, 26.67 GB, 29.21 GB, 31.50 GB, 33.63 GB, 35.40 GB, 37.14 GB, 38.54 GB, 39.79 GB, 41.09 GB, 42.37 GB, 43.60 GB, 44.80 GB, 45.95 GB, 47.01 GB, 48.09 GB, 49.17 GB, 50.22 GB, 51.21 GB, 52.40 GB, 53.46 GB, 54.58 GB, 55.61 GB, 56.58 GB, 57.53 GB, 58.45 GB, 59.39 GB, 60.31 GB, 61.20 GB, 62.12 GB, 63.04 GB, 64.15 GB, 65.11 GB, 66.15 GB, 67.06 GB, 67.87 GB, 68.66 GB, 69.49 GB
  ThreadUsage(1m4s): 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
   - AverageThreadTokens: 1.00
   - BloomFilterBytes: 0
   - PeakMemoryUsage: 70.47 GB (75663229366)
   - PeakReservation: 70.43 GB (75623301120)
   - PeakUsedReservation: 0
   - PerHostPeakMemUsage: 83.91 GB (90098285809)
   - RowsProduced: 0 (0)
   - TotalNetworkReceiveTime: 34m43s
   - TotalNetworkSendTime: 0.000ns
   - TotalStorageWaitTime: 0.000ns
   - TotalThreadsInvoluntaryContextSwitches: 7 (7)
   - TotalThreadsTotalWallClockTime: 973.873ms
     - TotalThreadsSysTime: 2.000ms
     - TotalThreadsUserTime: 55.991ms
   - TotalThreadsVoluntaryContextSwitches: 25 (25)
  Buffer pool:
     - AllocTime: 0.000ns
     - CumulativeAllocationBytes: 0
     - CumulativeAllocations: 0 (0)
     - PeakReservation: 0
     - PeakUnpinnedBytes: 0
     - PeakUsedReservation: 0
     - ReadIoBytes: 0
     - ReadIoOps: 0 (0)
     - ReadIoWaitTime: 0.000ns
     - ReservationLimit: 0
     - WriteIoBytes: 0
     - WriteIoOps: 0 (0)
     - WriteIoWaitTime: 0.000ns
  Fragment Instance Lifecycle Timings:
     - ExecTime: 0.000ns
       - ExecTreeExecTime: 0.000ns
     - OpenTime: 973.876ms
       - ExecTreeOpenTime: 915.567ms
     - PrepareTime: 198.988ms
       - ExecTreePrepareTime: 155.134us
  KuduTableSink:(Total: 12.589us, non-child: 12.589us, % non-child: 100.00%)
     - KuduApplyTimer: 0.000ns
     - NumRowErrors: 0 (0)
     - PeakMemoryUsage: 20.00 MB (20971520)
     - RowsProcessedRate: 0
     - TotalNumRows: 0 (0)
  SORT_NODE (id=2):(Total: 915.718ms, non-child: 0.000ns, % non-child: 0.00%)
    SortType: Partial
    ExecOption: Codegen Enabled
     - NumRowsPerRun: 0 (0) (Number of samples: 0)
     - InMemorySortTime: 0.000ns
     - PeakMemoryUsage: 70.43 GB (75623309312)
     - RowsReturned: 0 (0)
     - RowsReturnedRate: 0
     - RunsCreated: 1 (1)
     - SortDataSize: 0
    Buffer pool:
       - AllocTime: 2m47s
       - CumulativeAllocationBytes: 70.43 GB (75623301120)
       - CumulativeAllocations: 36.06K (36060)
       - PeakReservation: 70.43 GB (75623301120)
       - PeakUnpinnedBytes: 0
       - PeakUsedReservation: 70.43 GB (75623301120)
       - ReadIoBytes: 0
       - ReadIoOps: 0 (0)
       - ReadIoWaitTime: 0.000ns
       - WriteIoBytes: 0
       - WriteIoOps: 0 (0)
       - WriteIoWaitTime: 0.000ns
  EXCHANGE_NODE (id=1):(Total: 34m53s, non-child: 17s052ms, % non-child: 0.81%)
     - ConvertRowBatchTime: 7s479ms
     - PeakMemoryUsage: 19.53 MB (20481319)
     - RowsReturned: 276.19M (276187128)
{code}
[jira] [Moved] (IMPALA-6653) Unicode support for Kudu table names
[ https://issues.apache.org/jira/browse/IMPALA-6653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon moved KUDU-2340 to IMPALA-6653:
---
Workflow: jira (was: Kudu Workflow)
Key: IMPALA-6653 (was: KUDU-2340)
Project: IMPALA (was: Kudu)

> Unicode support for Kudu table names
>
> Key: IMPALA-6653
> URL: https://issues.apache.org/jira/browse/IMPALA-6653
> Project: IMPALA
> Issue Type: Bug
> Reporter: Jim Halfpenny
> Priority: Major
>
> It is possible to create a Kudu table containing unicode characters in its name in Impala by specifying the kudu.table_name attribute. When trying to select from this table you receive an error that the underlying table does not exist.
> The example below shows a table being created successfully, but failing on a select * statement.
> {{[jh-kafka-2:21000] > create table test2( a int primary key) stored as kudu TBLPROPERTIES('kudu.table_name' = 'impala::kudutest.');}}
> {{Query: create table test2( a int primary key) stored as kudu TBLPROPERTIES('kudu.table_name' = 'impala::kudutest.')}}
> {{WARNINGS: Unpartitioned Kudu tables are inefficient for large data sizes.}}
> {{Fetched 0 row(s) in 0.64s}}
> {{[jh-kafka-2:21000] > select * from test2;}}
> {{Query: select * from test2}}
> {{Query submitted at: 2018-03-13 08:23:29 (Coordinator: https://jh-kafka-2:25000)}}
> {{ERROR: AnalysisException: Failed to load metadata for table: 'test2'}}
> {{CAUSED BY: TableLoadingException: Error loading metadata for Kudu table impala::kudutest.}}
> {{CAUSED BY: ImpalaRuntimeException: Error opening Kudu table 'impala::kudutest.', Kudu error: The table does not exist: table_name: "impala::kudutest."}}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-4238) custom_cluster/test_client_ssl.py TestClientSsl.test_ssl AssertionError: SIGINT was not caught by shell within 30s
[ https://issues.apache.org/jira/browse/IMPALA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sailesh Mukil resolved IMPALA-4238.
---
Resolution: Duplicate

> custom_cluster/test_client_ssl.py TestClientSsl.test_ssl AssertionError:
> SIGINT was not caught by shell within 30s
> --
>
> Key: IMPALA-4238
> URL: https://issues.apache.org/jira/browse/IMPALA-4238
> Project: IMPALA
> Issue Type: Bug
> Components: Security
> Affects Versions: Impala 2.8.0, Impala 2.10.0
> Reporter: Harrison Sheinblatt
> Assignee: Sailesh Mukil
> Priority: Major
> Labels: flaky
>
> asf master core test failure:
> http://sandbox.jenkins.sf.cloudera.com/view/Impala/view/Evergreen-asf-master/job/impala-asf-master-core/540/
> http://sandbox.jenkins.sf.cloudera.com/job/impala-umbrella-build-and-test/4921/console
> {noformat}
> 08:18:51 === FAILURES ===
> 08:18:51 TestClientSsl.test_ssl[exec_option: {'disable_codegen': False, 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | table_format: text/none]
> 08:18:51
> 08:18:51 self =
> 08:18:51 vector =
> 08:18:51
> 08:18:51     @pytest.mark.execute_serially
> 08:18:51     @CustomClusterTestSuite.with_args("--ssl_server_certificate=%s/server-cert.pem "
> 08:18:51                                      "--ssl_private_key=%s/server-key.pem"
> 08:18:51                                      % (CERT_DIR, CERT_DIR))
> 08:18:51     def test_ssl(self, vector):
> 08:18:51
> 08:18:51         self._verify_negative_cases()
> 08:18:51         # TODO: This is really two different tests, but the custom cluster takes too long to
> 08:18:51         # start. Make it so that custom clusters can be specified across test suites.
> 08:18:51         self._validate_positive_cases("%s/server-cert.pem" % self.CERT_DIR)
> 08:18:51
> 08:18:51         # No certificate checking: will accept any cert.
> 08:18:51         self._validate_positive_cases()
> 08:18:51
> 08:18:51         # Test cancelling a query
> 08:18:51         impalad = ImpaladService(socket.getfqdn())
> 08:18:51         impalad.wait_for_num_in_flight_queries(0)
> 08:18:51         p = ImpalaShell(args="--ssl")
> 08:18:51         p.send_cmd("SET DEBUG_ACTION=0:OPEN:WAIT")
> 08:18:51         p.send_cmd("select count(*) from functional.alltypes")
> 08:18:51         impalad.wait_for_num_in_flight_queries(1)
> 08:18:51
> 08:18:51         LOG = logging.getLogger('test_client_ssl')
> 08:18:51         LOG.info("Cancelling query")
> 08:18:51         num_tries = 0
> 08:18:51         # In practice, sending SIGINT to the shell process doesn't always seem to get caught
> 08:18:51         # (and a search shows up some bugs in Python where SIGINT might be ignored). So retry
> 08:18:51         # for 30s until one signal takes.
> 08:18:51         while impalad.get_num_in_flight_queries() == 1:
> 08:18:51             time.sleep(1)
> 08:18:51             LOG.info("Sending signal...")
> 08:18:51             os.kill(p.pid(), signal.SIGINT)
> 08:18:51             num_tries += 1
> 08:18:51 >           assert num_tries < 30, "SIGINT was not caught by shell within 30s"
> 08:18:51 E           AssertionError: SIGINT was not caught by shell within 30s
> 08:18:51 E           assert 30 < 30
> 08:18:51
> 08:18:51 custom_cluster/test_client_ssl.py:85: AssertionError
> 08:18:51 Captured stdout setup -
> 08:18:51 Starting State Store logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/statestored.INFO
> 08:18:51 Starting Catalog Service logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/catalogd.INFO
> 08:18:51 Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad.INFO
> 08:18:51 Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad_node1.INFO
> 08:18:51 Starting Impala Daemon logging to /data/jenkins/workspace/impala-umbrella-build-and-test/repos/Impala/logs/custom_cluster_tests/impalad_node2.INFO
> 08:18:51 Waiting for Catalog... Status: 53 DBs / 1077 tables (ready=True)
> 08:18:51 Waiting for Catalog... Status: 53 DBs / 1077 tables (ready=True)
> 08:18:51 Waiting for Catalog... Status: 53 DBs / 1077 tables (ready=True)
> 08:18:51 Impala Cluster Running with 3 nodes.
> 08:18:51 Captured stderr setup -
> 08:18:51 MainThread: Found 3 impalad/1 statestored/1 catalogd process(es)
> 08:18:51 MainThread:
[jira] [Resolved] (IMPALA-6638) File handle cache shows contention when cold
[ https://issues.apache.org/jira/browse/IMPALA-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-6638.
---
Resolution: Fixed
Fix Version/s: Impala 2.12.0

> File handle cache shows contention when cold
>
> Key: IMPALA-6638
> URL: https://issues.apache.org/jira/browse/IMPALA-6638
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Major
> Fix For: Impala 2.12.0
>
>
> Performance tests show that when the file handle cache is cold, there is contention on the file handle cache partition lock. This added contention is particularly severe when there are multiple IO threads accessing the same file (e.g. when there is a query accessing multiple Parquet columns). This is because the IO threads all map to the same partition because they are accessing the same file.
> The contention is due to the fact that FileHandleCache::GetFileHandle() holds the lock while it opens the file handle. This lengthens the critical section considerably, because opening a file handle involves network traffic to the NameNode. This contention does not exist when the cache is hot.
> FileHandleCache::GetFileHandle() should drop the lock while it is opening the file handle.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
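[Editor's note] The fix described above — drop the partition lock across the slow open — can be sketched as follows. This is a hedged illustration, not Impala's actual FileHandleCache code: the class, the `int` handle type, and `OpenFile()` (a stand-in for the NameNode round trip) are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Sketch of one cache partition. The key point: the slow open happens
// OUTSIDE the critical section, so concurrent misses on the same
// partition no longer serialize behind the network round trip.
class FileHandleCachePartition {
 public:
  int GetFileHandle(const std::string& path) {
    {
      std::lock_guard<std::mutex> l(lock_);
      auto it = cache_.find(path);
      if (it != cache_.end()) return it->second;  // hot path: cache hit
    }
    // Slow path: open without holding the lock. In the real code this is
    // where the NameNode traffic would occur.
    int handle = OpenFile(path);
    std::lock_guard<std::mutex> l(lock_);
    // Another thread may have raced us and inserted first; keep the
    // handle that won the race so all callers agree.
    auto inserted = cache_.emplace(path, handle);
    return inserted.first->second;
  }

 private:
  int OpenFile(const std::string&) { return next_handle_++; }  // mock open
  std::mutex lock_;
  std::map<std::string, int> cache_;
  int next_handle_ = 1;
};
```

The trade-off, as in the report, is that two racing threads may both open a handle for the same file; one open is wasted, which is cheaper than holding the lock for the duration of a remote call.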
[jira] [Created] (IMPALA-6655) Set owner information on database creation
Fredy Wijaya created IMPALA-6655:
---
Summary: Set owner information on database creation
Key: IMPALA-6655
URL: https://issues.apache.org/jira/browse/IMPALA-6655
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Fredy Wijaya
Assignee: Fredy Wijaya

Currently, Impala only shows owner information when using DESCRIBE DATABASE EXTENDED for databases created outside Impala. When a database is created inside Impala, the owner information is never set. For table creation, Impala always sets the owner information, which can be shown by using DESCRIBE EXTENDED . To make the behavior consistent, we need to set the owner information on database creation.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IMPALA-6654) [DOCS] Kudu/Sentry docs are out of date
Thomas Tauber-Marshall created IMPALA-6654:
---
Summary: [DOCS] Kudu/Sentry docs are out of date
Key: IMPALA-6654
URL: https://issues.apache.org/jira/browse/IMPALA-6654
Project: IMPALA
Issue Type: Bug
Components: Docs
Affects Versions: Impala 2.11.0
Reporter: Thomas Tauber-Marshall

The documentation of Impala's support for Sentry authorization on Kudu tables, available here: http://impala.apache.org/docs/build/html/topics/impala_kudu.html is out of date. It should be updated to include the changes made in IMPALA-5489. In particular:
- Access is no longer "all or nothing": we support column-level permissions.
- Permissions do not apply "to all SQL operations": we support SELECT- and INSERT-specific permissions. DELETE/UPDATE/UPSERT still require ALL.

We should also document that "all on server" is required to specify "kudu.master_addresses" in a CREATE, even for managed tables, in addition to being required to CREATE any external table.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IMPALA-6449) Use CLOCK_MONOTONIC in ConditonVariable
[ https://issues.apache.org/jira/browse/IMPALA-6449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho resolved IMPALA-6449.
---
Resolution: Fixed
Fix Version/s: Impala 2.12.0
Impala 3.0

https://github.com/apache/impala/commit/30c0375ed358f8040d28fe756a17c6e3965177b1

IMPALA-6449: Use CLOCK_MONOTONIC in ConditionVariable

ConditionVariable is a thin wrapper around pthread_cond_*. Currently, pthread_cond_timedwait() uses the default attribute CLOCK_REALTIME. This is susceptible to adjustments to the system clock from various sources such as NTP, and time may go backward. This change fixes the problem by switching to CLOCK_MONOTONIC so time will be monotonic, although the frequency of the clock ticks may still be adjusted by NTP. Ideally, we should use CLOCK_MONOTONIC_RAW, but it's available only on Linux kernel 2.6.28 or later. This change also gets rid of some usage of boost::get_system_time(), which suffers from the same problem.

Change-Id: I81611cfd5e7c5347203fe7fa6b0f615602257f87
Reviewed-on: http://gerrit.cloudera.org:8080/9158
Reviewed-by: Michael Ho
Tested-by: Impala Public Jenkins

> Use CLOCK_MONOTONIC in ConditonVariable
> ---
>
> Key: IMPALA-6449
> URL: https://issues.apache.org/jira/browse/IMPALA-6449
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0, Impala 2.11.0, Impala 2.12.0
> Reporter: Michael Ho
> Assignee: Michael Ho
> Priority: Minor
> Fix For: Impala 3.0, Impala 2.12.0
>
>
> There are various places in the code which call {{ConditionVariable::WaitUntil()}} or {{ConditionVariable::WaitFor()}} with a time computed from {{boost::get_system_time()}}.
> {noformat}
> template >
> bool WaitFor(boost::unique_lock& lock, const duration_type& wait_duration) {
>   return WaitUntil(lock, to_timespec(boost::get_system_time() + wait_duration));
> }
> {noformat}
> blocking-queue.h:
> {noformat}
> template >
> bool BlockingPutWithTimeout(V&& val, int64_t timeout_micros) {
>   MonotonicStopWatch timer;
>   boost::unique_lock write_lock(put_lock_);
>   boost::system_time wtime = boost::get_system_time() +
>       boost::posix_time::microseconds(timeout_micros);
> {noformat}
> thrift-server.cc:
> {noformat}
> system_time deadline = get_system_time() +
>     posix_time::milliseconds(ThriftServer::ThriftServerEventProcessor::TIMEOUT_MS);
> // Loop protects against spurious wakeup. Locks provide necessary fences to ensure
> // visibility.
> while (!signal_fired_) {
>   // Yields lock and allows supervision thread to continue and signal
>   if (!signal_cond_.WaitUntil(lock, deadline)) {
> {noformat}
> The above are susceptible to clock adjustment from various sources such as NTP. We should switch to using {{clock_gettime(CLOCK_MONOTONIC, ...)}} for such elapsed time measurement.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
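[Editor's note] The mechanism behind this fix can be sketched with plain pthreads: a condition variable whose timed waits are measured on CLOCK_MONOTONIC, so an NTP step of the system clock cannot shorten or lengthen the wait. This is an illustrative minimal wrapper, not Impala's actual ConditionVariable class.

```cpp
#include <pthread.h>
#include <time.h>

// Condition variable configured to time out against CLOCK_MONOTONIC
// instead of the default CLOCK_REALTIME.
struct MonotonicCondVar {
  pthread_cond_t cond;

  MonotonicCondVar() {
    pthread_condattr_t attr;
    pthread_condattr_init(&attr);
    // The key line of the fix: deadlines are now interpreted on the
    // monotonic clock, which never jumps backward.
    pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    pthread_cond_init(&cond, &attr);
    pthread_condattr_destroy(&attr);
  }
  ~MonotonicCondVar() { pthread_cond_destroy(&cond); }

  // Waits up to 'timeout_ms' from now, measured on CLOCK_MONOTONIC.
  // Returns true if signalled, false on timeout.
  bool WaitFor(pthread_mutex_t* lock, long timeout_ms) {
    timespec deadline;
    clock_gettime(CLOCK_MONOTONIC, &deadline);  // must match the condattr clock
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
      deadline.tv_sec += 1;
      deadline.tv_nsec -= 1000000000L;
    }
    return pthread_cond_timedwait(&cond, lock, &deadline) == 0;
  }
};
```

The absolute deadline passed to pthread_cond_timedwait() must be computed with the same clock that was set on the condattr; mixing CLOCK_REALTIME timestamps with a CLOCK_MONOTONIC condvar reintroduces the original bug.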
[jira] [Resolved] (IMPALA-6624) Network error: failed to write to TLS socket: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c
[ https://issues.apache.org/jira/browse/IMPALA-6624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Ho resolved IMPALA-6624.
---
Resolution: Fixed
Fix Version/s: Impala 2.12.0
Impala 3.0

Fixed at [https://github.com/apache/impala/commit/8079cd9d2a87051f81a41910b74fab15e35f36ea]

KUDU-2334: Fix OutboundTransfer::TransferStarted() to work with SSL_write()

Previously, OutboundTransfer::TransferStarted() returned true iff a non-zero number of bytes had been successfully sent via Writev(). As it turns out, this doesn't work well with SSL_write(). When SSL_write() returns -1 with errno EAGAIN or ETRYAGAIN, we need to retry the call with exactly the same buffer pointer next time, even if 0 bytes have been written. The following sequence becomes problematic with the previous implementation of OutboundTransfer::TransferStarted():
- WriteHandler() calls SendBuffer() on an OutboundTransfer.
- SendBuffer() calls TlsSocket::Writev(), which hits the EAGAIN error above. Since 0 bytes were written, cur_slice_idx_ and cur_offset_in_slice_ remain 0 and OutboundTransfer::TransferStarted() still returns false.
- OutboundTransfer is cancelled or timed out. car->call is set to NULL.
- WriteHandler() is called again and, as it notices that the OutboundTransfer hasn't really started yet and "car->call" is NULL due to cancellation, it removes it from the outbound transfer queue and moves on to the next entry in the queue.
- WriteHandler() calls SendBuffer() with the next entry in the queue and eventually calls SSL_write() with a different buffer than expected by SSL_write(), leading to the "SSL3_WRITE_PENDING:bad write retry" error.

This change fixes the problem above by adding a boolean flag 'started_' which is set to true if OutboundTransfer::SendBuffer() has been called at least once. Also added some tests to exercise cancellation paths with multiple concurrent RPCs. Confirmed the problem above is fixed by running a stress test in a 130-node cluster with Impala.
The problem happened consistently without the fix.

Change-Id: Id7ebdcbc1ef2a3e0c5e7162f03214c232755b683
Reviewed-on: http://gerrit.cloudera.org:8080/9587
Reviewed-by: Sailesh Mukil
Reviewed-by: Todd Lipcon
Tested-by: Todd Lipcon
Reviewed-on: http://gerrit.cloudera.org:8080/9606
Tested-by: Impala Public Jenkins

> Network error: failed to write to TLS socket: error:1409F07F:SSL
> routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c
> -
>
> Key: IMPALA-6624
> URL: https://issues.apache.org/jira/browse/IMPALA-6624
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Affects Versions: Impala 3.0, Impala 2.12.0
> Reporter: Michael Ho
> Assignee: Michael Ho
> Priority: Blocker
> Fix For: Impala 3.0, Impala 2.12.0
>
>
> During stress testing in a secure 140-node cluster, Impalad ran into the following errors. This is supposed to be fixed in KUDU-2218. The fix for KUDU-2218 has already been cherry-picked into the Impala code base at this [commit|https://github.com/apache/impala/commit/678bf28e233e667b05585110422762614840bdc2] and the build should have this commit. It's unclear if Impala may be missing other commits or whether the issue in KUDU-2218 is not completely fixed.
> Assigning to [~sailesh] to lead the investigation. Please feel free to reassign to me if you are swamped, Sailesh.
> {noformat}
> W0307 03:31:04.512100 158268 connection.cc:659] client connection to 10.17.221.47:27000 send error: Network error: failed to write to TLS socket: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c:874
> W0307 03:31:04.524086 158268 connection.cc:153] Shutting down client connection to 10.17.221.47:27000 with pending inbound data (11/16 bytes received, last active 0 ns ago, status=Network error: failed to write to TLS socket: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c:874)
> E0307 03:31:04.535635 123156 krpc-data-stream-sender.cc:335] channel send to 10.17.221.47:27000 failed: TransmitData() to 10.17.221.47:27000 failed: Network error: failed to write to TLS socket: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c:874
> W0307 03:31:04.536145 158268 connection.cc:190] Error closing socket: Network error: TlsSocket::Close: Success
> E0307 03:31:04.584370 140087 krpc-data-stream-sender.cc:335] channel send to 10.17.221.47:27000 failed: TransmitData() to 10.17.221.47:27000 failed: Network error: failed to write to TLS socket: error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry:s3_pkt.c:874
> I0307 03:31:04.697773 158412 rpcz_store.cc:255] Call
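[Editor's note] The core of the KUDU-2334 fix quoted above can be reduced to a few lines: track "has this transfer started?" with an explicit flag instead of inferring it from bytes written, because a TLS write can consume zero bytes (EAGAIN) and still obligate the caller to retry with the same buffer. The class below is an illustrative mock, not Kudu's actual OutboundTransfer.

```cpp
#include <cassert>

// Mock of an outbound transfer. 'would_block' models SSL_write()
// returning -1/EAGAIN after writing zero bytes.
class OutboundTransfer {
 public:
  void SendBuffer(bool would_block) {
    // The fix: mark the transfer as started the moment a send is
    // attempted, regardless of whether any bytes went out.
    started_ = true;
    if (!would_block) bytes_sent_ += 1;
  }

  // Before the fix this was effectively (bytes_sent_ > 0), which stayed
  // false after a zero-byte EAGAIN. A cancelled transfer then looked
  // "not started" and could be dequeued, letting the next transfer call
  // SSL_write() with a different buffer -> "bad write retry".
  bool TransferStarted() const { return started_; }

 private:
  bool started_ = false;
  int bytes_sent_ = 0;
};
```

With the flag, a transfer that hit EAGAIN on its first write still reports started, so the connection is torn down instead of reusing it with a mismatched buffer.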
[jira] [Created] (IMPALA-6656) Metrics for time spent in BufferAllocator
Tim Armstrong created IMPALA-6656:
---
Summary: Metrics for time spent in BufferAllocator
Key: IMPALA-6656
URL: https://issues.apache.org/jira/browse/IMPALA-6656
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Tim Armstrong
Assignee: Tim Armstrong

We should track the total time spent and the time spent in TCMalloc so we can understand where time is going globally. I think we should shard these metrics across the arenas so we can see if the problem is just per-arena, and also to avoid contention between threads when updating the metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
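[Editor's note] The sharding idea proposed above can be sketched as one counter per arena, summed on read, so concurrent updaters touch disjoint cache lines. This is a hedged illustration; the names (ShardedTimeMetric, kNumArenas) are hypothetical and not Impala's metrics API.

```cpp
#include <atomic>

constexpr int kNumArenas = 8;  // illustrative arena count

struct ShardedTimeMetric {
  // One counter per arena; alignas(64) keeps each shard on its own cache
  // line so updates from different arenas don't false-share.
  struct alignas(64) Shard {
    std::atomic<long> ns{0};
  };
  Shard shards[kNumArenas];

  // Called from the arena's own thread(s): no cross-arena contention.
  void Add(int arena, long ns) {
    shards[arena % kNumArenas].ns.fetch_add(ns, std::memory_order_relaxed);
  }

  // Read path sums the shards; this also exposes per-arena values, which
  // is what lets you see whether the time is concentrated in one arena.
  long TotalNs() const {
    long total = 0;
    for (const Shard& s : shards) total += s.ns.load(std::memory_order_relaxed);
    return total;
  }
};
```

Relaxed ordering is enough here because the metric is advisory: readers only need an eventually consistent sum, not a synchronization point.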
[jira] [Created] (IMPALA-6657) Investigate why memory allocation in Exchange receiver node takes a long time
Mostafa Mokhtar created IMPALA-6657:
---
Summary: Investigate why memory allocation in Exchange receiver node takes a long time
Key: IMPALA-6657
URL: https://issues.apache.org/jira/browse/IMPALA-6657
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 2.11.0
Reporter: Mostafa Mokhtar
Assignee: Mostafa Mokhtar
Attachments: Impala query profile.txt

While inserting large amounts of data into a Kudu table, it was observed that the Exchange operator was running slowly; the query profile showed that a big portion of the time was spent in memory allocation in the buffer pool.

{code}
EXCHANGE_NODE (id=1):(Total: 5h53m, non-child: 48s853ms, % non-child: 0.23%)
   - ConvertRowBatchTime: 20s289ms
   - PeakMemoryUsage: 19.53 MB (20483562)
   - RowsReturned: 575.30M (575298780)
   - RowsReturnedRate: 27.10 K/sec
  Buffer pool:
     - AllocTime: 2h53m
     - CumulativeAllocationBytes: 261.26 GB (280526643200)
     - CumulativeAllocations: 13.70M (13697590)
     - PeakReservation: 18.06 MB (18939904)
     - PeakUnpinnedBytes: 0
     - PeakUsedReservation: 18.06 MB (18939904)
     - ReadIoBytes: 0
     - ReadIoOps: 0 (0)
     - ReadIoWaitTime: 0.000ns
     - WriteIoBytes: 0
     - WriteIoOps: 0 (0)
     - WriteIoWaitTime: 0.000ns
  RecvrSide:
    BytesReceived(8m32s): 20.91 GB, 37.03 GB, 45.62 GB, 53.22 GB, 60.17 GB, 66.30 GB, 71.60 GB, 76.59 GB, 81.36 GB, 86.03 GB, 90.35 GB, 94.30 GB, 98.17 GB, 101.98 GB, 105.58 GB, 109.08 GB, 112.33 GB, 115.47 GB, 118.45 GB, 121.30 GB, 124.09 GB, 126.74 GB, 129.26 GB, 131.88 GB, 134.41 GB, 136.85 GB, 139.32 GB, 141.77 GB, 144.23 GB, 146.71 GB, 148.26 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.29 GB, 148.30 GB, 148.30 GB, 148.30 GB, 148.30 GB, 148.30 GB, 148.30 GB, 148.30 GB, 148.30 GB
     - FirstBatchArrivalWaitTime: 1s071ms
     - TotalBytesReceived: 148.30 GB (159234237617)
     - TotalGetBatchTime: 5h53m
       - DataArrivalTimer: 5h52m
  SenderSide:
     - DeserializeRowBatchTime: 3h4m
     - NumBatchesArrived: 6.85M (6848795)
     - NumBatchesDeferred: 99.67K (99667)
     - NumBatchesEnqueued: 6.85M (6848795)
     - NumBatchesReceived: 6.85M (6848795)
     - NumEarlySenders: 0 (0)
     - NumEosReceived: 0 (0)
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)