[GitHub] [hbase] haohao0103 commented on a diff in pull request #5121: HBASE-27733

2023-03-22 Thread via GitHub


haohao0103 commented on code in PR #5121:
URL: https://github.com/apache/hbase/pull/5121#discussion_r1145589892


##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -769,12 +784,48 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
         .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+
       HFileScanner scanner = halfReader.getScanner(false, false, false);
       scanner.seekTo();
       do {
-        halfWriter.append(scanner.getCell());
+        final Cell cell = scanner.getCell();
+        if (null != halfWriter) {
+          halfWriter.append(cell);
+        } else {
+
+          // init halfwriter
+          if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {
+            byte[] rowKey = CellUtil.cloneRow(cell);
+            HRegionLocation hRegionLocation = FutureUtils.get(loc.getRegionLocation(rowKey));
+            InetSocketAddress[] favoredNodes = null;
+            if (null == hRegionLocation) {
+              LOG.warn("Failed get of location, use default writer {}", Bytes.toString(rowKey));

Review Comment:
   OK, I will update this.



##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -769,12 +784,48 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
         .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+
       HFileScanner scanner = halfReader.getScanner(false, false, false);
       scanner.seekTo();
       do {
-        halfWriter.append(scanner.getCell());
+        final Cell cell = scanner.getCell();
+        if (null != halfWriter) {
+          halfWriter.append(cell);
+        } else {
+
+          // init halfwriter
+          if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {
+            byte[] rowKey = CellUtil.cloneRow(cell);
+            HRegionLocation hRegionLocation = FutureUtils.get(loc.getRegionLocation(rowKey));
+            InetSocketAddress[] favoredNodes = null;
+            if (null == hRegionLocation) {
+              LOG.warn("Failed get of location, use default writer {}", Bytes.toString(rowKey));
+              halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
+                .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+            } else {
+              LOG.debug("First rowkey: [{}]", Bytes.toString(rowKey));
+              InetSocketAddress initialIsa =
+                new InetSocketAddress(hRegionLocation.getHostname(), hRegionLocation.getPort());
+              if (initialIsa.isUnresolved()) {
+                LOG.warn("Failed resolve address {}, use default writer",

Review Comment:
   OK, I will update this.
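The fallback being discussed (region location missing, or the hostname unresolvable, means "use the default writer") can be sketched in isolation. This is a toy model with a hypothetical name (`chooseFavoredNode`), not the HBase API; the real code feeds the resulting address into `StoreFileWriter.Builder.withFavoredNodes`:

```java
import java.net.InetSocketAddress;

public class FavoredNodeChoice {

  // Returns the favored-node address when the region location is known and the
  // hostname resolves; returns null to signal "fall back to the default writer".
  static InetSocketAddress chooseFavoredNode(String hostname, int port) {
    if (hostname == null) {
      // No region location found for the row key: use the default writer.
      return null;
    }
    InetSocketAddress isa = new InetSocketAddress(hostname, port);
    if (isa.isUnresolved()) {
      // Hostname did not resolve: also fall back to the default writer.
      return null;
    }
    return isa;
  }

  public static void main(String[] args) {
    // An IP literal always resolves, so a favored node is chosen.
    System.out.println(chooseFavoredNode("127.0.0.1", 16020) != null); // true
    // A missing location falls back to the default writer.
    System.out.println(chooseFavoredNode(null, 16020) == null); // true
  }
}
```

The two `null` returns collapse both failure modes into one signal, which matches the diff: a single `favoredNodes == null` check selects the plain builder.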



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hbase] haohao0103 commented on a diff in pull request #5121: HBASE-27733

2023-03-20 Thread via GitHub


haohao0103 commented on code in PR #5121:
URL: https://github.com/apache/hbase/pull/5121#discussion_r1142842551


##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -768,13 +782,50 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withChecksumType(StoreUtils.getChecksumType(conf))
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
-        .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+        .build();
+
       HFileScanner scanner = halfReader.getScanner(false, false, false);
       scanner.seekTo();
       do {
-        halfWriter.append(scanner.getCell());
+        final Cell cell = scanner.getCell();
+        if (null != halfWriter) {
+          halfWriter.append(cell);
+        } else {
+
+          // init halfwriter
+          if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {
+            byte[] rowKey = CellUtil.cloneRow(cell);
+            HRegionLocation hRegionLocation = FutureUtils.get(loc.getRegionLocation(rowKey));
+            InetSocketAddress[] favoredNodes = null;
+            if (null == hRegionLocation) {
+              LOG.trace("Failed get of location, use default writer {}", Bytes.toString(rowKey));
+            } else {
+              LOG.debug("First rowkey: [{}]", Bytes.toString(rowKey));
+              InetSocketAddress initialIsa =
+                new InetSocketAddress(hRegionLocation.getHostname(), hRegionLocation.getPort());
+              if (initialIsa.isUnresolved()) {
+                LOG.trace("Failed resolve address {}, use default writer",
+                  hRegionLocation.getHostnamePort());
+              } else {
+                LOG.debug("Use favored nodes writer: {}", initialIsa.getHostString());
+                favoredNodes = new InetSocketAddress[] { initialIsa };
+              }
+            }
+            if (null == favoredNodes) {
+              halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
+                .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+            } else {
+              halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
+                .withBloomType(bloomFilterType).withFileContext(hFileContext)
+                .withFavoredNodes(favoredNodes).build();
+            }

Review Comment:
   OK






[GitHub] [hbase] haohao0103 commented on a diff in pull request #5121: HBASE-27733

2023-03-20 Thread via GitHub


haohao0103 commented on code in PR #5121:
URL: https://github.com/apache/hbase/pull/5121#discussion_r1142830952


##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -768,13 +782,50 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withChecksumType(StoreUtils.getChecksumType(conf))
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
-        .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+        .build();
+
       HFileScanner scanner = halfReader.getScanner(false, false, false);
       scanner.seekTo();
       do {
-        halfWriter.append(scanner.getCell());
+        final Cell cell = scanner.getCell();
+        if (null != halfWriter) {
+          halfWriter.append(cell);
+        } else {
+
+          // init halfwriter
+          if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {

Review Comment:
   Sorry, I don't understand what you mean by "waste". This code runs only once,
   when the halfWriter is first initialized, not once for each cell.
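The point above, that the location lookup runs only for the first cell because `halfWriter` is initialized lazily inside the loop, can be illustrated with a toy model (hypothetical names; a counter stands in for the expensive `StoreFileWriter` construction):

```java
import java.util.List;

public class LazyInitLoop {

  static int initCount = 0; // how many times the "expensive" init ran

  // Stand-in for building the StoreFileWriter (hypothetical).
  static String initWriter() {
    initCount++;
    return "writer";
  }

  // Mirrors the loop shape in the PR: the writer is created lazily,
  // on the first cell only, then reused for every subsequent cell.
  static int copyAll(List<String> cells) {
    String writer = null;
    int appended = 0;
    for (String cell : cells) {
      if (writer == null) {
        writer = initWriter(); // executes once, not per cell
      }
      appended++; // stand-in for writer.append(cell)
    }
    return appended;
  }

  public static void main(String[] args) {
    int n = copyAll(List.of("a", "b", "c"));
    System.out.println(n + " cells, " + initCount + " init"); // 3 cells, 1 init
  }
}
```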






[GitHub] [hbase] haohao0103 commented on a diff in pull request #5121: HBASE-27733

2023-03-20 Thread via GitHub


haohao0103 commented on code in PR #5121:
URL: https://github.com/apache/hbase/pull/5121#discussion_r1142826895


##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -768,13 +782,50 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withChecksumType(StoreUtils.getChecksumType(conf))
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
-        .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+        .build();
+
       HFileScanner scanner = halfReader.getScanner(false, false, false);
       scanner.seekTo();
       do {
-        halfWriter.append(scanner.getCell());
+        final Cell cell = scanner.getCell();
+        if (null != halfWriter) {
+          halfWriter.append(cell);
+        } else {
+
+          // init halfwriter
+          if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {
+            byte[] rowKey = CellUtil.cloneRow(cell);
+            HRegionLocation hRegionLocation = FutureUtils.get(loc.getRegionLocation(rowKey));
+            InetSocketAddress[] favoredNodes = null;
+            if (null == hRegionLocation) {
+              LOG.trace("Failed get of location, use default writer {}", Bytes.toString(rowKey));
+            } else {
+              LOG.debug("First rowkey: [{}]", Bytes.toString(rowKey));
+              InetSocketAddress initialIsa =
+                new InetSocketAddress(hRegionLocation.getHostname(), hRegionLocation.getPort());
+              if (initialIsa.isUnresolved()) {
+                LOG.trace("Failed resolve address {}, use default writer",

Review Comment:
   OK, I got it.






[GitHub] [hbase] haohao0103 commented on a diff in pull request #5121: HBASE-27733

2023-03-20 Thread via GitHub


haohao0103 commented on code in PR #5121:
URL: https://github.com/apache/hbase/pull/5121#discussion_r1142826021


##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -768,13 +782,50 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withChecksumType(StoreUtils.getChecksumType(conf))
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
-        .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+        .build();
+
       HFileScanner scanner = halfReader.getScanner(false, false, false);
       scanner.seekTo();
       do {
-        halfWriter.append(scanner.getCell());
+        final Cell cell = scanner.getCell();
+        if (null != halfWriter) {
+          halfWriter.append(cell);
+        } else {
+
+          // init halfwriter
+          if (conf.getBoolean(LOCALITY_SENSITIVE_CONF_KEY, DEFAULT_LOCALITY_SENSITIVE)) {
+            byte[] rowKey = CellUtil.cloneRow(cell);
+            HRegionLocation hRegionLocation = FutureUtils.get(loc.getRegionLocation(rowKey));
+            InetSocketAddress[] favoredNodes = null;
+            if (null == hRegionLocation) {
+              LOG.trace("Failed get of location, use default writer {}", Bytes.toString(rowKey));

Review Comment:
   Yes, it falls back to a default writer that doesn't take the region location
   into account. I will raise the log level to warn.






[GitHub] [hbase] haohao0103 commented on a diff in pull request #5121: HBASE-27733

2023-03-20 Thread via GitHub


haohao0103 commented on code in PR #5121:
URL: https://github.com/apache/hbase/pull/5121#discussion_r1142824027


##
hbase-server/src/main/java/org/apache/hadoop/hbase/tool/BulkLoadHFilesTool.java:
##
@@ -768,13 +782,50 @@ private static void copyHFileHalf(Configuration conf, Path inFile, Path outFile,
         .withChecksumType(StoreUtils.getChecksumType(conf))
         .withBytesPerCheckSum(StoreUtils.getBytesPerChecksum(conf)).withBlockSize(blocksize)
         .withDataBlockEncoding(familyDescriptor.getDataBlockEncoding()).withIncludesTags(true)
-        .withCreateTime(EnvironmentEdgeManager.currentTime()).build();
-      halfWriter = new StoreFileWriter.Builder(conf, cacheConf, fs).withFilePath(outFile)
-        .withBloomType(bloomFilterType).withFileContext(hFileContext).build();
+        .build();

Review Comment:
   It was my mistake; I will correct it.


