[
https://issues.apache.org/jira/browse/PHOENIX-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Samarth Jain updated PHOENIX-1596:
----------------------------------
Attachment: PHOENIX-1596.patch
Patch that fixes the stale thread local value for the current span. The span set
in thread local by calling Tracing.childOnServer() wasn't being reset back to the
old value of the current span. As a result, every subsequent request executing on
that thread saw tracing as enabled. The more queries that ran, the more threads
ended up with stale thread local values, eventually leading to infinite loops and
a region server crash.
Also, in Indexer.java the call to currentSpan.stop() is not needed since closing
the trace scope itself stops the current span. The whitespace changes in that file
are because the indentation and formatting were off.
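As a rough illustration of why the explicit stop() is redundant (assuming the
HTrace-style behavior where closing a trace scope stops its span; the types here
are simplified stand-ins, not the real classes):

    // Sketch only: a Closeable scope whose close() already stops the span,
    // making a separate span.stop() around close() unnecessary.
    final class ScopeSketch {
        interface Span { void stop(); }

        static final class TraceScope implements AutoCloseable {
            private final Span span;
            TraceScope(Span span) { this.span = span; }
            Span getSpan() { return span; }
            @Override public void close() { span.stop(); } // closing stops the span
        }

        static void doTracedWork(TraceScope scope) {
            try {
                // ... work traced under scope.getSpan() ...
            } finally {
                // scope.getSpan().stop();  // redundant: close() handles it
                scope.close();
            }
        }
    }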
[~jamestaylor][~jesse_yates] please review.
> Setting tracing frequency to ALWAYS on the client side results in too many
> traces
> ---------------------------------------------------------------------------------
>
> Key: PHOENIX-1596
> URL: https://issues.apache.org/jira/browse/PHOENIX-1596
> Project: Phoenix
> Issue Type: Bug
> Reporter: Samarth Jain
> Attachments: PHOENIX-1596.patch
>
>
> After setting the trace collection frequency to always via the following in
> hbase-site.xml, I noticed that it created far too many traces in the trace
> table.
> <property>
> <name>phoenix.trace.frequency</name>
> <value>always</value>
> </property>
> 0: jdbc:phoenix:localhost> select count (*) from system.tracing_stats;
> +------------------------------------------+
> | COUNT(1) |
> +------------------------------------------+
> | 1283 |
> +------------------------------------------+
> 1 row selected (1.104 seconds)
> 0: jdbc:phoenix:localhost> select count (*) from system.tracing_stats;
> +------------------------------------------+
> | COUNT(1) |
> +------------------------------------------+
> | 4051 |
> +------------------------------------------+
> 1 row selected (1.058 seconds)
> 0: jdbc:phoenix:localhost> select count (*) from system.tracing_stats;
> +------------------------------------------+
> | COUNT(1) |
> +------------------------------------------+
> | 10668 |
> +------------------------------------------+
> 1 row selected (1.105 seconds)
> 0: jdbc:phoenix:localhost> select count (*) from system.tracing_stats;
> +------------------------------------------+
> | COUNT(1) |
> +------------------------------------------+
> | 11361 |
> +------------------------------------------+
> 1 row selected (1.046 seconds)
> 0: jdbc:phoenix:localhost> select count (*) from system.tracing_stats;
> +------------------------------------------+
> | COUNT(1) |
> +------------------------------------------+
> | 193119 |
> +------------------------------------------+
> 1 row selected (6.737 seconds)
> 0: jdbc:phoenix:localhost> select count (*) from system.tracing_stats;
> 15/01/19 17:26:57 WARN client.HConnectionManager$HConnectionImplementation:
> This client just lost it's session with ZooKeeper, closing it. It will be
> recreated next time someone needs it
> Even though the only query being executed was the select count(*) to get the
> number of rows in the trace table, it ended up creating far more traces than I
> had expected.
> On my mac, it in fact ended up killing the local hbase cluster altogether!
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)