[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2017-01-13 Thread Yeonseop Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822539#comment-15822539
 ] 

Yeonseop Kim commented on PHOENIX-3553:
---

Thank you!

> Zookeeper connection should be closed immediately after 
> DefaultStatisticsCollector's collecting stats done
> --
>
> Key: PHOENIX-3553
> URL: https://issues.apache.org/jira/browse/PHOENIX-3553
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.9.0
> Reporter: Yeonseop Kim
> Assignee: Yeonseop Kim
> Labels: stats, zookeeper
> Fix For: 4.10.0
>
> Attachments: PHOENIX-3553.patch
>
>
> In every minor compaction job of HBase,
> org.apache.phoenix.schema.stats.DefaultStatisticsCollector.initGuidePostDepth()
> is called, and the SYSTEM.CATALOG table is opened to get the guidepost width via
> htable = env.getTable(
>     SchemaUtil.getPhysicalTableName(PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES,
>         env.getConfiguration()));
> This function call creates one ZooKeeper connection to get the cluster id.
> DefaultStatisticsCollector doesn't close this ZooKeeper connection
> immediately after getting the guidepost width, and the connection remains
> alive until the HRegion is closed.
> This is not a problem with a small number of Regions, but when the number of
> Regions is large and upsert operations are frequent, the number of ZooKeeper
> connections gradually increases to hundreds, and the ZooKeeper server nodes
> run short of available TCP/IP ports.
> This ZooKeeper connection should be closed immediately after getting the
> guidepost width.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2017-01-04 Thread Yeonseop Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798257#comment-15798257
 ] 

Yeonseop Kim commented on PHOENIX-3553:
---

One possible scenario is the "all files in Store selection" case. In this case, 
HBase treats the minor compaction as a major compaction.




[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2017-01-04 Thread Yeonseop Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798247#comment-15798247
 ] 

Yeonseop Kim commented on PHOENIX-3553:
---

ScanType.COMPACT_DROP_DELETES is determined in the DefaultCompactor and 
CompactionRequest classes:
{code:title=DefaultCompactor.java|borderStyle=solid}
  public List<Path> compact(final CompactionRequest request,
      CompactionThroughputController throughputController, User user) throws IOException {
    ...
    ScanType scanType =
        request.isAllFiles() ? ScanType.COMPACT_DROP_DELETES : ScanType.COMPACT_RETAIN_DELETES;
    ...
{code}
{code:title=CompactionRequest.java|borderStyle=solid}
  public boolean isAllFiles() {
    return this.isMajor == DisplayCompactionType.MAJOR
        || this.isMajor == DisplayCompactionType.ALL_FILES;
  }

  public boolean isMajor() {
    return this.isMajor == DisplayCompactionType.MAJOR;
  }
{code}
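
To make the consequence of the two excerpts above concrete, here is a minimal, self-contained sketch of the same decision logic. It does not use the HBase classes: ScanTypeSketch and its nested enums are stand-ins written for illustration, and the MINOR constant is assumed to be the remaining display type. Only the two predicates and the ternary mirror the quoted code.

```java
public class ScanTypeSketch {
    // Stand-in for the display type held by CompactionRequest (MINOR is an assumption).
    public enum DisplayCompactionType { MINOR, ALL_FILES, MAJOR }

    // Stand-in for the two scan types named in the DefaultCompactor excerpt.
    public enum ScanType { COMPACT_RETAIN_DELETES, COMPACT_DROP_DELETES }

    // Same predicate as CompactionRequest.isAllFiles() in the excerpt.
    public static boolean isAllFiles(DisplayCompactionType type) {
        return type == DisplayCompactionType.MAJOR
            || type == DisplayCompactionType.ALL_FILES;
    }

    // Same predicate as CompactionRequest.isMajor() in the excerpt.
    public static boolean isMajor(DisplayCompactionType type) {
        return type == DisplayCompactionType.MAJOR;
    }

    // Same ternary as in the DefaultCompactor.compact() excerpt.
    public static ScanType scanTypeFor(DisplayCompactionType type) {
        return isAllFiles(type)
            ? ScanType.COMPACT_DROP_DELETES
            : ScanType.COMPACT_RETAIN_DELETES;
    }

    public static void main(String[] args) {
        // Print the scan type chosen for each display type.
        for (DisplayCompactionType type : DisplayCompactionType.values()) {
            System.out.println(type + ": isMajor=" + isMajor(type)
                + ", scanType=" + scanTypeFor(type));
        }
    }
}
```

In this model an ALL_FILES compaction is not "major" according to isMajor(), yet it still gets COMPACT_DROP_DELETES, which is the scan type the Phoenix preCompact hook checks for.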

The following is a real HBase (v1.1.8) log from my site.
It shows that during the compaction two ZooKeeper connections are established, 
but only one of them is closed.
(sessionid=0x259160941a6fa5d,0x259160941a6fa5e)

The last line of the log tells us that this compaction is of type 
DisplayCompactionType.ALL_FILES, not DisplayCompactionType.MAJOR. 
If the compaction type were DisplayCompactionType.MAJOR, the word "major" 
would appear in the last line.

{noformat}
2016-12-23 16:48:31,112 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607] regionserver.HStore: Starting compaction of 6 file(s) in L#0 of TEST.TEST_TABLE,\x0DTCCLN78\x00K4571038A\x00MA_RES_RMV_INTRA_CHUCK\x0009\x00TCCLN78,1482479246551.044fb02eb5f873b692730b2dcbe76e13. into tmpdir=hdfs://ichbig/hbase/data/default/TEST.TEST_TABLE/044fb02eb5f873b692730b2dcbe76e13/.tmp, totalSize=25.8 M
2016-12-23 16:48:31,195 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x7a261ee connecting to ZooKeeper ensemble=ichbig-01-004:2181,ichbig-01-005:2181,ichbig-02-005:2181,ichbig-01-006:2181,ichbig-02-006:2181
2016-12-23 16:48:31,195 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607] zookeeper.ZooKeeper: Initiating client connection, connectString=ichbig-01-004:2181,ichbig-01-005:2181,ichbig-02-005:2181,ichbig-01-006:2181,ichbig-02-006:2181 sessionTimeout=9 watcher=hconnection-0x7a261ee0x0, quorum=ichbig-01-004:2181,ichbig-01-005:2181,ichbig-02-005:2181,ichbig-01-006:2181,ichbig-02-006:2181, baseZNode=/hbase
2016-12-23 16:48:31,197 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607-SendThread(ichbig-01-005:2181)] zookeeper.ClientCnxn: Opening socket connection to server ichbig-01-005/172.16.252.25:2181. Will not attempt to authenticate using SASL (unknown error)
2016-12-23 16:48:31,197 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607-SendThread(ichbig-01-005:2181)] zookeeper.ClientCnxn: Socket connection established to ichbig-01-005/172.16.252.25:2181, initiating session
2016-12-23 16:48:31,204 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607-SendThread(ichbig-01-005:2181)] zookeeper.ClientCnxn: Session establishment complete on server ichbig-01-005/172.16.252.25:2181, sessionid = 0x259160941a6fa5d, negotiated timeout = 9
2016-12-23 16:48:31,207 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607] zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x7cdbb8b2 connecting to ZooKeeper ensemble=ichbig-01-004:2181,ichbig-01-005:2181,ichbig-02-005:2181,ichbig-01-006:2181,ichbig-02-006:2181
2016-12-23 16:48:31,207 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607] zookeeper.ZooKeeper: Initiating client connection, connectString=ichbig-01-004:2181,ichbig-01-005:2181,ichbig-02-005:2181,ichbig-01-006:2181,ichbig-02-006:2181 sessionTimeout=9 watcher=hconnection-0x7cdbb8b20x0, quorum=ichbig-01-004:2181,ichbig-01-005:2181,ichbig-02-005:2181,ichbig-01-006:2181,ichbig-02-006:2181, baseZNode=/hbase
2016-12-23 16:48:31,208 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607-SendThread(ichbig-01-005:2181)] zookeeper.ClientCnxn: Opening socket connection to server ichbig-01-005/172.16.252.25:2181. Will not attempt to authenticate using SASL (unknown error)
2016-12-23 16:48:31,208 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607-SendThread(ichbig-01-005:2181)] zookeeper.ClientCnxn: Socket connection established to ichbig-01-005/172.16.252.25:2181, initiating session
2016-12-23 16:48:31,215 INFO  [regionserver/myhost/172.16.252.78:16020-shortCompactions-1482479243607-SendThread(ichbig-01-005:2181)] zookeeper.ClientCnxn: Session establishment complete on server ichbig-01-005/172.16.252.25:2181, sessionid = 

[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2017-01-04 Thread Yeonseop Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15797646#comment-15797646
 ] 

Yeonseop Kim commented on PHOENIX-3553:
---

Scheduled major compaction is disabled (hbase.hregion.majorcompaction = 0).
One possible scenario is the "all files in Store selection" case. In this case, 
HBase treats the minor compaction as a major compaction.



[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2017-01-03 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796840#comment-15796840
 ] 

James Taylor commented on PHOENIX-3553:
---

The patch looks fine, but I'm confused as to why 
DefaultStatisticsCollector.initGuidePostDepth() would be called during a minor 
compaction given this check in UngroupedAggregateRegionObserver:
{code}
@Override
public InternalScanner preCompact(final ObserverContext<RegionCoprocessorEnvironment> c,
        final Store store, final InternalScanner scanner, final ScanType scanType)
        throws IOException {
    // Compaction and split upcalls run with the effective user context of the requesting user.
    // This will lead to failure of cross cluster RPC if the effective user is not
    // the login user. Switch to the login user context to ensure we have the expected
    // security context.
    return User.runAsLoginUser(new PrivilegedExceptionAction<InternalScanner>() {
        @Override
        public InternalScanner run() throws Exception {
            TableName table = c.getEnvironment().getRegion().getRegionInfo().getTable();
            InternalScanner internalScanner = scanner;
            if (scanType.equals(ScanType.COMPACT_DROP_DELETES)) {
                try {
                    long clientTimeStamp = TimeKeeper.SYSTEM.getCurrentTime();
                    StatisticsCollector stats =
                            StatisticsCollectorFactory.createStatisticsCollector(
                                    c.getEnvironment(), table.getNameAsString(),
                                    clientTimeStamp, store.getFamily().getName());
                    internalScanner =
                            stats.createCompactionScanner(c.getEnvironment(), store, scanner);
{code}



[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2017-01-02 Thread Yeonseop Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15794129#comment-15794129
 ] 

Yeonseop Kim commented on PHOENIX-3553:
---

I don't think test code is needed, because the patch is trivial: I only 
added a try-finally wrapper and htable.close().
RegionCoprocessorEnvironment.getTable calls 
CoprocessorHConnection.getConnectionForEnvironment, which returns a 
ClusterConnection object. 
ClusterConnection is an unmanaged connection that we must clean up externally, 
as mentioned in the HBase source (see line 54 in 
https://github.com/apache/hbase/blob/branch-1.1/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java).
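
As an illustration of the try-finally pattern described above, here is a minimal, self-contained sketch. It is not the attached patch and it does not use the HBase API: FakeTable is a hypothetical stand-in for the table handle returned by env.getTable(...), OPEN is a counter standing in for live ZooKeeper connections, and the width value 1024 is arbitrary.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class CloseTableSketch {
    // Number of stand-in "connections" that are currently open.
    public static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for the table handle; opening it takes one connection.
    static class FakeTable implements Closeable {
        FakeTable() {
            OPEN.incrementAndGet();
        }

        long readGuidepostWidth(boolean fail) throws IOException {
            if (fail) {
                throw new IOException("simulated read failure");
            }
            return 1024L; // arbitrary illustrative value
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Reads the width and always releases the table, even when the read throws.
    public static long readWidth(boolean fail) throws IOException {
        FakeTable table = null;
        try {
            table = new FakeTable();
            return table.readGuidepostWidth(fail);
        } finally {
            if (table != null) {
                table.close();
            }
        }
    }

    // Convenience wrapper that reports success instead of propagating the exception.
    public static boolean readQuietly(boolean fail) {
        try {
            readWidth(fail);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("success=" + readQuietly(false) + ", open=" + OPEN.get());
        System.out.println("success=" + readQuietly(true) + ", open=" + OPEN.get());
    }
}
```

In this sketch the open count returns to zero after both the successful and the failing read, which is the property the finally block is meant to guarantee.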



[jira] [Commented] (PHOENIX-3553) Zookeeper connection should be closed immediately after DefaultStatisticsCollector's collecting stats done

2016-12-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15787049#comment-15787049
 ] 

Hadoop QA commented on PHOENIX-3553:


{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12845150/PHOENIX-3553.patch
  against master branch at commit 07f92732f9c6d2d9464012cebeb4cefc10da95d5.
  ATTACHMENT ID: 12845150

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 42 warning messages.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than 100:
+SchemaUtil.getPhysicalTableName(PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME_BYTES, env.getConfiguration()));
+get.addColumn(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES, PhoenixDatabaseMetaData.GUIDE_POSTS_WIDTH_BYTES);
+guidepostWidth = PLong.INSTANCE.getCodec().decodeLong(cell.getValueArray(), cell.getValueOffset(), SortOrder.getDefault());

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/712//testReport/
Javadoc warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/712//artifact/patchprocess/patchJavadocWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/712//console

This message is automatically generated.
