[jira] [Resolved] (HADOOP-18691) Add a CallerContext getter on the Schedulable interface

2023-04-20 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-18691.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Add a CallerContext getter on the Schedulable interface
> ---
>
> Key: HADOOP-18691
> URL: https://issues.apache.org/jira/browse/HADOOP-18691
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Christos Bisias
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> We would like to add a default *CallerContext* getter on the *Schedulable* 
> interface:
> {code:java}
> default public CallerContext getCallerContext() {
>   return null;  
> } {code}
> and then override it on the *ipc/Server.Call* class:
> {code:java}
> @Override
> public CallerContext getCallerContext() {  
>   return this.callerContext;
> } {code}
> to expose the already existing *callerContext* field.
>  
> This change will help us access the *CallerContext* on an Apache Ozone 
> *IdentityProvider* implementation.
> On the Ozone side, the *FairCallQueue* doesn't work with the Ozone S3G, 
> because all users are masked under a special S3G user and there is no 
> impersonation. Therefore, the FCQ sees only one user and becomes ineffective. 
> We can use the *CallerContext* field to store the current user and access it 
> in the Ozone *IdentityProvider*, as sketched below.
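> A minimal sketch of the consuming side (not part of this patch), assuming the 
> S3G stores the real end user in the caller context; the class name here is 
> invented for illustration:
> {code:java}
> import org.apache.hadoop.ipc.CallerContext;
> import org.apache.hadoop.ipc.IdentityProvider;
> import org.apache.hadoop.ipc.Schedulable;
> import org.apache.hadoop.security.UserGroupInformation;
>
> public class CallerContextIdentityProvider implements IdentityProvider {
>   @Override
>   public String makeIdentity(Schedulable schedulable) {
>     // Prefer the caller context, where S3G would store the real user.
>     CallerContext ctx = schedulable.getCallerContext();
>     if (ctx != null && ctx.isContextValid()) {
>       return ctx.getContext();
>     }
>     // Fall back to the (masked) RPC user.
>     UserGroupInformation ugi = schedulable.getUserGroupInformation();
>     return ugi == null ? null : ugi.getShortUserName();
>   }
> }
> {code}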
>  
> This is a presentation with the proposed approach.
> [https://docs.google.com/presentation/d/1iChpCz_qf-LXiPyvotpOGiZ31yEUyxAdU4RhWMKo0c0/edit#slide=id.p]






[jira] [Resolved] (HADOOP-18693) Upgrade Apache Derby due to CVEs

2023-04-13 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-18693.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Upgrade Apache Derby due to CVEs
> 
>
> Key: HADOOP-18693
> URL: https://issues.apache.org/jira/browse/HADOOP-18693
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> [https://github.com/advisories/GHSA-wr69-g62g-2r9h]
> [https://github.com/advisories/GHSA-42xw-p62x-hwcf]
> [https://github.com/apache/hadoop/pull/5427]
> Derby only seems to be used in test scope, but it would be nice to silence 
> the Dependabot warnings by merging the PR.
>  
>  






[jira] [Created] (HADOOP-18699) InvalidProtocolBufferException caused by JDK 11 < 11.0.18 AES-CTR cipher state corruption with AVX-512 bug

2023-04-12 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-18699:
---

 Summary: InvalidProtocolBufferException caused by JDK 11 < 11.0.18 
AES-CTR cipher state corruption with AVX-512 bug
 Key: HADOOP-18699
 URL: https://issues.apache.org/jira/browse/HADOOP-18699
 Project: Hadoop Common
  Issue Type: Bug
  Components: hdfs-client
Reporter: Siyao Meng


This serves as a PSA for a JDK bug. Not really a bug in Hadoop / HDFS itself.

[~relek] identified that [JDK-8292158|https://bugs.openjdk.org/browse/JDK-8292158] 
(backported to JDK 11 in 
[JDK-8295297|https://bugs.openjdk.org/browse/JDK-8295297]) causes HDFS clients 
to fail with InvalidProtocolBufferException due to a corrupted protobuf message 
in the Hadoop RPC request when both of the conditions below are met:

1. The host supports the AVX-512 instruction set extensions.
2. AVX-512 is enabled in the JVM. This is the default on AVX-512-capable 
hosts, equivalent to specifying the JVM arg {{-XX:UseAVX=3}}.

As a result, the client could print messages like these:

{code:title=Symptoms on the HDFS client}
2023-02-21 15:21:44,380 WARN org.apache.hadoop.hdfs.DFSClient: Connection 
failure: Failed to connect to  for file 
/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2023_02_21-15_21_25.b6788e89894a61b5
 for block 
BP-1836197545-10.125.248.11-1672668423261:blk_1073935111_194857:com.google.protobuf.InvalidProtocolBufferException:
 Protocol message tag had invalid wire type.
com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had 
invalid wire type.

2023-02-21 15:21:44,378 WARN org.apache.hadoop.hdfs.DFSClient: Connection 
failure: Failed to connect to  for file 
/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2023_02_21-15_21_25.b6788e89894a61b5
 for block 
BP-1836197545--1672668423261:blk_1073935111_194857:com.google.protobuf.InvalidProtocolBufferException:
 Protocol message end-group tag did not match expected tag.
com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group 
tag did not match expected tag.

2023-02-21 15:06:55,530 WARN org.apache.hadoop.hdfs.DFSClient: Connection 
failure: Failed to connect to  for file 
/tmp/.cloudera_health_monitoring_canary_files/.canary_file_2023_02_21-15_06_55.b4a633a8bde014aa
 for block 
BP-1836197545--1672668423261:blk_1073935025_194771:com.google.protobuf.InvalidProtocolBufferException:
 While parsing a protocol message, the input ended unexpectedly in the middle 
of a field. This could mean either than the input has been truncated or that an 
embedded message misreported its own length.
com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol 
message, the input ended unexpectedly in the middle of a field. This could mean 
either than the input has been truncated or that an embedded message 
misreported its own length.
{code}

Solutions:
1. As a workaround, append {{-XX:UseAVX=2}} to client JVM args; or
2. Upgrade to OpenJDK >= 11.0.18.


I might post a repro test case for this, or find a way in the code to prompt 
the user that this could be the cause (i.e., that JDK 11 needs upgrading) when 
it occurs.
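
A minimal sketch of such a prompt, using only the standard {{Runtime.version()}} 
API (Java 10+); the warning text is illustrative:

{code:java}
Runtime.Version v = Runtime.version();
// JDK-8292158 was fixed in JDK 11.0.18 (via JDK-8295297),
// so warn on any 11.0.x earlier than that.
if (v.feature() == 11 && v.update() < 18) {
  System.err.println("WARNING: JDK 11 older than 11.0.18 detected. "
      + "AES-CTR state corruption with AVX-512 (JDK-8292158) may corrupt "
      + "Hadoop RPC messages; upgrade the JDK or pass -XX:UseAVX=2.");
}
{code}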






[jira] [Created] (HADOOP-18564) Use file-level checksum by default when copying between two different file systems

2022-12-08 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-18564:
---

 Summary: Use file-level checksum by default when copying between 
two different file systems
 Key: HADOOP-18564
 URL: https://issues.apache.org/jira/browse/HADOOP-18564
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Siyao Meng


h2. Goal

Reduce user friction

h2. Background

When distcp'ing between two different file systems, distcp still uses 
block-level checksums by default, even though the two file systems may manage 
blocks very differently, so a block-level checksum comparison no longer makes 
sense between them.

e.g. distcp between HDFS and Ozone without overriding 
{{dfs.checksum.combine.mode}} throws an IOException because the blocks of the 
same file on the two FSes are different (as expected):

{code}
$ hadoop distcp -i -pp /test o3fs://buck-test1.vol1.ozone1/
java.lang.Exception: java.io.IOException: File copy failed: 
hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> 
o3fs://buck-test1.vol1.ozone1/test/test.bin
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: File copy failed: 
hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin --> 
o3fs://buck-test1.vol1.ozone1/test/test.bin
at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:219)
at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:48)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying 
hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin to 
o3fs://buck-test1.vol1.ozone1/test/test.bin
at 
org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
... 11 more
Caused by: java.io.IOException: Checksum mismatch between 
hdfs://duong-1.duong.root.hwx.site:8020/test/test.bin and 
o3fs://buck-test1.vol1.ozone1/.distcp.tmp.attempt_local1346550241_0001_m_00_0.Source
 and destination filesystems are of different types
Their checksum algorithms may be incompatible You can choose file-level 
checksum validation via -Ddfs.checksum.combine.mode=COMPOSITE_CRC when 
block-sizes or filesystems are different. Or you can skip checksum-checks 
altogether  with -skipcrccheck.
{code}

And it works when we use a file-level checksum like {{COMPOSITE_CRC}}:

{code:title=With -Ddfs.checksum.combine.mode=COMPOSITE_CRC}
$ hadoop distcp -i -pp /test o3fs://buck-test2.vol1.ozone1/ 
-Ddfs.checksum.combine.mode=COMPOSITE_CRC
22/10/18 19:07:42 INFO mapreduce.Job: Job job_local386071499_0001 completed 
successfully
22/10/18 19:07:42 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=219900
FILE: Number of bytes written=794129
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=0
HDFS: Number of bytes written=0
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
HDFS: Number of bytes read erasure-coded=0
O3FS: Number of bytes read=0
O3FS: Number of bytes written=0
O3FS: Number of read operations=5
O3FS: Number of large read operations=0
O3FS: Number of write operations=0
..
{code}

h2. Alternative

(if changing global defaults could potentially break distcp'ing between 
HDFS/S3/etc.)

Don't touch the global default, and make it a client-side config.

e.g. add a config that automatically enables COMPOSITE_CRC 
({{dfs.checksum.combine.mode}}) when distcp'ing between HDFS and Ozone, which 
would be the equivalent of specifying {{-Ddfs.checksum.combine.mode=COMPOSITE_CRC}} 
on the distcp command, but the end user won't have to specify it every single 
time.
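
A rough sketch of that alternative; the config key 
{{distcp.checksum.combine.mode.auto}} is invented here for illustration:

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ChecksumModeSelector {
  // Sketch: pick the checksum mode once, before the copy starts.
  public static void maybeUseCompositeCrc(Configuration conf, URI src, URI dst)
      throws java.io.IOException {
    boolean auto = conf.getBoolean("distcp.checksum.combine.mode.auto", false);
    // Only switch when the two FileSystems are of different types, so
    // HDFS-to-HDFS copies keep the block-level default.
    String srcScheme = FileSystem.get(src, conf).getUri().getScheme();
    String dstScheme = FileSystem.get(dst, conf).getUri().getScheme();
    if (auto && !srcScheme.equals(dstScheme)) {
      conf.set("dfs.checksum.combine.mode", "COMPOSITE_CRC");
    }
  }
}
{code}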

[jira] [Resolved] (HADOOP-18101) Bump aliyun-sdk-oss to 3.13.2 and jdom2 to 2.0.6.1

2022-02-03 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-18101.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Bump aliyun-sdk-oss to 3.13.2 and jdom2 to 2.0.6.1
> --
>
> Key: HADOOP-18101
> URL: https://issues.apache.org/jira/browse/HADOOP-18101
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Aswin Shakil Balasubramanian
>Assignee: Aswin Shakil Balasubramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The current aliyun-sdk-oss 3.13.0 is affected by 
> [CVE-2021-33813|https://github.com/advisories/GHSA-2363-cqg2-863c] due to 
> jdom 2.0.6. maven-shade-plugin is also affected by the CVE. 
> Bumping aliyun-sdk-oss to 3.13.2 and jdom2 to 2.0.6.1 will resolve this issue:
> {code:java}
> [INFO] +- org.apache.maven.plugins:maven-shade-plugin:jar:3.2.1:provided
> [INFO] |  +- 
> org.apache.maven.shared:maven-artifact-transfer:jar:0.10.0:provided
> [INFO] |  +- org.jdom:jdom2:jar:2.0.6:provided
> ..
> [INFO] +- com.aliyun.oss:aliyun-sdk-oss:jar:3.13.1:compile
> [INFO] |  +- org.jdom:jdom2:jar:2.0.6:compile
> {code}
>  






[jira] [Created] (HADOOP-18048) [branch-3.3] Dockerfile_aarch64 build fails with fatal error: Python.h: No such file or directory

2021-12-14 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-18048:
---

 Summary: [branch-3.3] Dockerfile_aarch64 build fails with fatal 
error: Python.h: No such file or directory
 Key: HADOOP-18048
 URL: https://issues.apache.org/jira/browse/HADOOP-18048
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Siyao Meng
Assignee: Siyao Meng


See previous discussion: 
https://issues.apache.org/jira/browse/HADOOP-17723?focusedCommentId=17452329=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17452329






[jira] [Created] (HADOOP-18031) Dockerfile: Support arm64 (aarch64)

2021-12-02 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-18031:
---

 Summary: Dockerfile: Support arm64 (aarch64)
 Key: HADOOP-18031
 URL: https://issues.apache.org/jira/browse/HADOOP-18031
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Siyao Meng
Assignee: Siyao Meng


Support building Linux arm64 (aarch64) Docker images, and bump some dependency 
versions.

The patch for branch-3.3 is ready. I developed it on branch-3.3.1 while trying 
to build an arm64 Linux Hadoop Docker image.

For trunk (3.4.0), due to HADOOP-17509, I need to post a different PR.






[jira] [Created] (HADOOP-17834) Bump aliyun-sdk-oss to 2.0.6

2021-08-03 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-17834:
---

 Summary: Bump aliyun-sdk-oss to 2.0.6
 Key: HADOOP-17834
 URL: https://issues.apache.org/jira/browse/HADOOP-17834
 Project: Hadoop Common
  Issue Type: Task
Reporter: Siyao Meng
Assignee: Siyao Meng


Bump aliyun-sdk-oss to 2.0.6 in order to remove the jdom 1.1 dependency.

Ref: 
https://issues.apache.org/jira/browse/HADOOP-17820?focusedCommentId=17390206=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17390206.






[jira] [Resolved] (HADOOP-17820) Remove dependency on jdom

2021-07-29 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-17820.
-
Resolution: Won't Do

> Remove dependency on jdom
> -
>
> Key: HADOOP-17820
> URL: https://issues.apache.org/jira/browse/HADOOP-17820
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> It doesn't seem that jdom is referenced anywhere in the code base now, yet it 
> exists in the distribution.
> {code}
> $ find . -name "*jdom*.jar"
> ./hadoop-3.4.0-SNAPSHOT/share/hadoop/tools/lib/jdom-1.1.jar
> {code}
> Recently [CVE-2021-33813|https://github.com/advisories/GHSA-2363-cqg2-863c] 
> was issued for jdom. Let's remove the binary from the dist if it's not useful.






[jira] [Created] (HADOOP-17820) Remove dependency on jdom

2021-07-29 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-17820:
---

 Summary: Remove dependency on jdom
 Key: HADOOP-17820
 URL: https://issues.apache.org/jira/browse/HADOOP-17820
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Siyao Meng
Assignee: Siyao Meng


It doesn't seem that jdom is referenced anywhere in the code base now, yet it 
exists in the distribution.

{code}
$ find . -name "*jdom*.jar"
./hadoop-3.4.0-SNAPSHOT/share/hadoop/tools/lib/jdom-1.1.jar
{code}

Recently [CVE-2021-33813|https://github.com/advisories/GHSA-2363-cqg2-863c] 
was issued for jdom. Let's remove the binary from the dist if it's not useful.






[jira] [Resolved] (HADOOP-17387) Replace HTrace with NoOp tracer

2020-12-10 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-17387.
-
Resolution: Duplicate

> Replace HTrace with NoOp tracer
> ---
>
> Key: HADOOP-17387
> URL: https://issues.apache.org/jira/browse/HADOOP-17387
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>
> HADOOP-17171 raised the concern that the deprecated HTrace binaries have a few 
> CVEs in their dependency jackson-databind. Note that HADOOP-15566 may not be 
> merged any time soon. We can replace the existing HTrace impl with a no-op 
> (dummy) tracer.
> This could be realized by reusing some code in HADOOP-15566's PR.






[jira] [Created] (HADOOP-17424) Replace HTrace with No-Op tracer

2020-12-09 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-17424:
---

 Summary: Replace HTrace with No-Op tracer
 Key: HADOOP-17424
 URL: https://issues.apache.org/jira/browse/HADOOP-17424
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Siyao Meng


Remove the HTrace dependency, as it depends on old jackson jars. Use a no-op 
tracer for now to eliminate potential security issues.






[jira] [Created] (HADOOP-17387) Replace HTrace with dummy implementation

2020-11-18 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-17387:
---

 Summary: Replace HTrace with dummy implementation
 Key: HADOOP-17387
 URL: https://issues.apache.org/jira/browse/HADOOP-17387
 Project: Hadoop Common
  Issue Type: Task
Reporter: Siyao Meng
Assignee: Siyao Meng


HADOOP-17171 raised the concern that the deprecated HTrace binaries have a few 
CVEs in their dependency jackson-databind. Note that HADOOP-15566 may not be 
merged any time soon. We can replace the existing HTrace impl with a no-op 
(dummy) tracer.

This could be realized by reusing some code in HADOOP-15566's PR.
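
The idea, in a minimal sketch (names are illustrative, not the actual 
HADOOP-15566 code):

{code:java}
// A "dummy" tracer: call sites keep their span structure, but nothing
// is recorded and nothing is exported.
public final class NoopTraceScope implements AutoCloseable {
  private static final NoopTraceScope INSTANCE = new NoopTraceScope();

  private NoopTraceScope() { }

  public static NoopTraceScope startSpan(String description) {
    return INSTANCE;  // no allocation, no recording
  }

  @Override
  public void close() {
    // nothing to flush or report
  }
}
{code}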






[jira] [Resolved] (HADOOP-16082) FsShell ls: Add option -i to print inode id

2020-09-22 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-16082.
-
Resolution: Abandoned

> FsShell ls: Add option -i to print inode id
> ---
>
> Key: HADOOP-16082
> URL: https://issues.apache.org/jira/browse/HADOOP-16082
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
> Attachments: HADOOP-16082.001.patch
>
>
> When debugging an FSImage corruption issue, I often need to know a file's or 
> directory's inode id. At the moment, the only way to do that is to use the OIV 
> tool to dump the FSImage and look up the filename, which is very inefficient.
> Here I propose adding option "-i" in FsShell that prints files' or 
> directories' inode id.
> h2. Implementation
> h3. For hdfs:// (HDFS)
> fileId exists in HdfsLocatedFileStatus, which is already returned to 
> hdfs-client. We just need to print it in Ls#processPath().
> h3. For file:// (Local FS)
> h4. Linux
> Use java.nio.
> h4. Windows
> Windows has the concept of "File ID" which is similar to inode id. It is 
> unique in NTFS and ReFS.
> h3. For other FS
> The fileId entry will be "0" in FileStatus if it is not set. We could either 
> ignore it or throw an exception.






[jira] [Created] (HADOOP-16935) Backport HADOOP-10848 Cleanup calling of sun.security.krb5.Config to branch-3.2

2020-03-24 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16935:
---

 Summary: Backport HADOOP-10848 Cleanup calling of 
sun.security.krb5.Config to branch-3.2
 Key: HADOOP-16935
 URL: https://issues.apache.org/jira/browse/HADOOP-16935
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Siyao Meng
Assignee: Siyao Meng


Backport HADOOP-10848 to lower branches so applications using Hadoop 3.2.x can 
get rid of the annoying message:
{code}
WARNING: Illegal reflective access by 
org.apache.hadoop.security.authentication.util.KerberosUtil 
(file:/path/to/lib/hadoop-auth-3.2.0.jar) to method 
sun.security.krb5.Config.getInstance()
{code}






[jira] [Created] (HADOOP-16902) Add OpenTracing in S3 Cloud Connector

2020-03-03 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16902:
---

 Summary: Add OpenTracing in S3 Cloud Connector
 Key: HADOOP-16902
 URL: https://issues.apache.org/jira/browse/HADOOP-16902
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Siyao Meng









[jira] [Created] (HADOOP-16891) Upgrade jackson-databind to 2.9.10.3

2020-02-27 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16891:
---

 Summary: Upgrade jackson-databind to 2.9.10.3
 Key: HADOOP-16891
 URL: https://issues.apache.org/jira/browse/HADOOP-16891
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Siyao Meng
Assignee: Siyao Meng


A new RCE was found in jackson-databind 2.0.0 through 2.9.10.2 and patched in 
2.9.10.3. Looks critical.






[jira] [Created] (HADOOP-16867) [thirdparty] Add shaded JaegerTracer

2020-02-18 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16867:
---

 Summary: [thirdparty] Add shaded JaegerTracer
 Key: HADOOP-16867
 URL: https://issues.apache.org/jira/browse/HADOOP-16867
 Project: Hadoop Common
  Issue Type: Task
Reporter: Siyao Meng
Assignee: Siyao Meng


Add the artifact {{hadoop-shaded-jaeger}} to {{hadoop-thirdparty}} for the 
OpenTracing work in HADOOP-15566.

CC [~weichiu]






[jira] [Created] (HADOOP-16718) Allow disabling Server Name Indication (SNI) for Jetty

2019-11-18 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16718:
---

 Summary: Allow disabling Server Name Indication (SNI) for Jetty
 Key: HADOOP-16718
 URL: https://issues.apache.org/jira/browse/HADOOP-16718
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Siyao Meng


As of now, {{createHttpsChannelConnector()}} enables SNI by default in Jetty:
{code}
private ServerConnector createHttpsChannelConnector(
Server server, HttpConfiguration httpConfig) {
  httpConfig.setSecureScheme(HTTPS_SCHEME);
  httpConfig.addCustomizer(new SecureRequestCustomizer());
  ServerConnector conn = createHttpChannelConnector(server, httpConfig);
{code}

because the no-argument {{SecureRequestCustomizer}} constructor automatically 
sets {{sniHostCheck}} to {{true}}:
{code}
public SecureRequestCustomizer()
{
this(true);
}
{code}

Proposal: We should make this configurable and probably default it to false, 
as sketched below.
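
A sketch of the knob inside {{createHttpsChannelConnector()}}; the config key 
name is invented here for illustration:

{code:java}
// Hypothetical key; proposed default is false (SNI host check disabled).
boolean sniHostCheck = conf.getBoolean("hadoop.http.sni.host.check", false);
// Jetty's SecureRequestCustomizer(boolean) constructor controls the check.
httpConfig.addCustomizer(new SecureRequestCustomizer(sniHostCheck));
{code}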

Credit: Aravindan Vijayan






[jira] [Created] (HADOOP-16656) Document FairCallQueue configs in core-default.xml

2019-10-15 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16656:
---

 Summary: Document FairCallQueue configs in core-default.xml
 Key: HADOOP-16656
 URL: https://issues.apache.org/jira/browse/HADOOP-16656
 Project: Hadoop Common
  Issue Type: Task
Reporter: Siyao Meng


So far, the callqueue / scheduler / FairCallQueue-related configurations are 
only documented in FairCallQueue.md in 3.3.0:
https://aajisaka.github.io/hadoop-document/hadoop-project/hadoop-project-dist/hadoop-common/FairCallQueue.html#Full_List_of_Configurations
(Thanks Akira for uploading this.)

Goal: document those configs in core-default.xml as well, to make it easier 
for users (admins) to find and use them; see the sketch below.
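
For example, each config would get a core-default.xml entry along these lines 
(the key follows the real {{ipc.[port_number].callqueue.impl}} pattern; the 
description text is illustrative):

{code:xml}
<property>
  <name>ipc.[port_number].callqueue.impl</name>
  <value>java.util.concurrent.LinkedBlockingQueue</value>
  <description>The fully qualified name of a class to use as the
    implementation of a call queue. Set it to
    org.apache.hadoop.ipc.FairCallQueue to enable the Fair Call Queue.
  </description>
</property>
{code}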






[jira] [Resolved] (HADOOP-14930) Upgrade Jetty to 9.4 version

2019-10-01 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-14930.
-
Resolution: Duplicate

Closing this one since the latest work is being done in HADOOP-16152.

> Upgrade Jetty to 9.4 version
> 
>
> Key: HADOOP-14930
> URL: https://issues.apache.org/jira/browse/HADOOP-14930
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HADOOP-14930.00.patch
>
>
> Currently 9.3.19.v20170502 is used.
> In hbase 2.0+, 9.4.6.v20170531 is used.
> When starting mini dfs cluster in hbase unit tests, we get the following:
> {code}
> java.lang.NoSuchMethodError: 
> org.eclipse.jetty.server.session.SessionHandler.getSessionManager()Lorg/eclipse/jetty/server/SessionManager;
>   at 
> org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:548)
>   at org.apache.hadoop.http.HttpServer2.(HttpServer2.java:529)
>   at org.apache.hadoop.http.HttpServer2.(HttpServer2.java:119)
>   at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:415)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:157)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:887)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:723)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:949)
>   at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:928)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1637)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNode(MiniDFSCluster.java:1277)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.configureNameService(MiniDFSCluster.java:1046)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:921)
> {code}
> This issue is to upgrade Jetty to 9.4 version






[jira] [Resolved] (HADOOP-16619) Upgrade jackson and jackson-databind to 2.9.10

2019-10-01 Thread Siyao Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyao Meng resolved HADOOP-16619.
-
Fix Version/s: 3.3.0
   Resolution: Done

> Upgrade jackson and jackson-databind to 2.9.10
> --
>
> Key: HADOOP-16619
> URL: https://issues.apache.org/jira/browse/HADOOP-16619
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
> Fix For: 3.3.0
>
>
> Two more CVEs (CVE-2019-16335 and CVE-2019-14540) are addressed in 
> jackson-databind 2.9.10.
> For details see Jackson Release 2.9.10 [release 
> notes|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9.10].






[jira] [Created] (HADOOP-16619) Update jackson-* and jackson-databind to 2.9.10

2019-09-30 Thread Siyao Meng (Jira)
Siyao Meng created HADOOP-16619:
---

 Summary: Update jackson-* and jackson-databind to 2.9.10
 Key: HADOOP-16619
 URL: https://issues.apache.org/jira/browse/HADOOP-16619
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Siyao Meng
Assignee: Siyao Meng


Two more CVEs (CVE-2019-16335 and CVE-2019-14540) are addressed in 
jackson-databind 2.9.10.

For details see Jackson Release 2.9.10 [release 
notes|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9.10].






[jira] [Created] (HADOOP-16487) Update jackson-databind to 2.9.9.2

2019-08-02 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16487:
---

 Summary: Update jackson-databind to 2.9.9.2
 Key: HADOOP-16487
 URL: https://issues.apache.org/jira/browse/HADOOP-16487
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Siyao Meng
Assignee: Siyao Meng


Another CVE in jackson-databind:
https://nvd.nist.gov/vuln/detail/CVE-2019-14379

jackson-databind 2.9.9.2 is available: 
https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind

Side note: Here's a discussion jira on whether to remove jackson-databind, 
given the increasing number of CVEs in this dependency recently: HADOOP-16485






[jira] [Created] (HADOOP-16289) Allow extra jsvc startup option in hadoop_start_secure_daemon in hadoop-functions.sh

2019-05-02 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16289:
---

 Summary: Allow extra jsvc startup option in 
hadoop_start_secure_daemon in hadoop-functions.sh
 Key: HADOOP-16289
 URL: https://issues.apache.org/jira/browse/HADOOP-16289
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Affects Versions: 3.1.2, 3.2.0
Reporter: Siyao Meng
Assignee: Siyao Meng


Because the discussion in HADOOP-16276 has digressed and we might want to pull 
in more people to look at it, I want to speed this up by making a simple change 
to the script in this jira (a change that would otherwise have been included in 
HADOOP-16276): add HADOOP_DAEMON_JSVC_EXTRA_OPTS to the jsvc startup command so 
that users can specify extra options.






[jira] [Created] (HADOOP-16276) Fix jsvc startup command in hadoop-functions.sh due to jsvc >= 1.0.11 changed default current working directory

2019-04-25 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16276:
---

 Summary: Fix jsvc startup command in hadoop-functions.sh due to 
jsvc >= 1.0.11 changed default current working directory
 Key: HADOOP-16276
 URL: https://issues.apache.org/jira/browse/HADOOP-16276
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.1.2, 3.2.0
Reporter: Siyao Meng
Assignee: Siyao Meng


In CDH6, when we bumped jsvc from 1.0.10 to 1.1.0, we hit *KerberosAuthException: 
failure to login / LoginException: Unable to obtain password from user* due to 
DAEMON-264, because our *dfs.nfs.keytab.file* config uses a relative path. I 
will probably file another jira to issue a warning like *hdfs.keytab not found* 
before the KerberosAuthException in this case.

The solution is to add *-cwd $(pwd)* to the function hadoop_start_secure_daemon 
in hadoop-functions.sh, but I will have to consider compatibility with older 
jsvc versions <= 1.0.10. Will post the patch after I have tested it.

Thanks [~tlipcon] for finding the root cause.






[jira] [Created] (HADOOP-16264) [JDK11] Track failing Hadoop unit tests on OpenJDK 11

2019-04-18 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16264:
---

 Summary: [JDK11] Track failing Hadoop unit tests on OpenJDK 11
 Key: HADOOP-16264
 URL: https://issues.apache.org/jira/browse/HADOOP-16264
 Project: Hadoop Common
  Issue Type: Task
Affects Versions: 3.1.2
Reporter: Siyao Meng
Assignee: Siyao Meng


Although there is still a lot of work to do before we can compile Hadoop with 
JDK 11 (HADOOP-15338), it is already possible to compile Hadoop with JDK 8 and 
run it (e.g. HDFS NN/DN, YARN NM/RM) on JDK 11.

But after compiling branch-3.1.2 with JDK 8, I ran the unit tests with JDK 11 
and there are a LOT of unit test failures (44 out of 96 maven projects contain 
at least one unit test failure according to the maven reactor summary). This 
may well indicate that some functionality is actually broken on JDK 11. Some of 
the failures already have a jira number; some might have been fixed in 3.2.0; 
some might share the same root cause.

By definition, this jira should be part of HADOOP-15338, but the goal of this 
one is just to keep track of the unit test failures and (hopefully) resolve 
all of them soon.






[jira] [Created] (HADOOP-16263) Update BUILDING.txt with macOS native build instructions

2019-04-18 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16263:
---

 Summary: Update BUILDING.txt with macOS native build instructions
 Key: HADOOP-16263
 URL: https://issues.apache.org/jira/browse/HADOOP-16263
 Project: Hadoop Common
  Issue Type: Task
Reporter: Siyao Meng
Assignee: Siyao Meng


I recently tried to compile Hadoop native on a Mac and found a few catches, 
which involved fixing some YARN native compilation issues (YARN-8622, 
YARN-9487).

Also, one needs to specify the (Homebrew-installed) OpenSSL header include dir 
when building native with maven on a Mac. BUILDING.txt should be updated to 
cover this.






[jira] [Created] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match

2019-01-29 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16083:
---

 Summary: DistCp shouldn't always overwrite the target file when 
checksums match
 Key: HADOOP-16083
 URL: https://issues.apache.org/jira/browse/HADOOP-16083
 Project: Hadoop Common
  Issue Type: Improvement
  Components: tools/distcp
Affects Versions: 3.1.1, 3.2.0, 3.3.0
Reporter: Siyao Meng
Assignee: Siyao Meng


{code:java|title=CopyMapper#setup}
...
try {
  // Forces overwrite whenever the target path is an existing file,
  // regardless of whether the source and target checksums match.
  overWrite = overWrite || targetFS.getFileStatus(targetFinalPath).isFile();
} catch (FileNotFoundException ignored) {
}
...
{code}

The above code overrides the config key "overWrite" to "true" whenever the 
target path is a file, so an unnecessary transfer happens even when the source 
and target file have the same checksum.

My suggestion: remove the code above. If the user insists on overwriting, they 
can just add -overwrite to the options:
{code:bash|title=DistCp command with -overwrite option}
hadoop distcp -overwrite hdfs://localhost:64464/source/5/6.txt 
hdfs://localhost:64464/target/5/6.txt
{code}






[jira] [Created] (HADOOP-16082) FsShell ls: Add option -i to print inode id

2019-01-28 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16082:
---

 Summary: FsShell ls: Add option -i to print inode id
 Key: HADOOP-16082
 URL: https://issues.apache.org/jira/browse/HADOOP-16082
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Reporter: Siyao Meng
Assignee: Siyao Meng


When debugging an FSImage corruption issue, I often need to know a file's or 
directory's inode id. At the moment, the only way to do that is to use the OIV 
tool to dump the FSImage and look up the filename, which is very inefficient.

Here I propose adding option "-i" in FsShell that prints files' or directories' 
inode id.

h2. Implementation

h3. For hdfs:// (HDFS)
fileId exists in HdfsLocatedFileStatus, which is already returned to 
hdfs-client. We just need to print it in Ls#processPath().

h3. For file://
h4. Linux
Use java.nio.
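
For instance, a sketch using the standard "unix:ino" attribute, which the 
default provider supports on POSIX systems:

{code:java}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class InodeExample {
  public static void main(String[] args) throws Exception {
    Path p = Paths.get("/etc/hosts");
    // "unix:ino" resolves through the UnixFileAttributeView on Linux/macOS.
    long inode = (Long) Files.getAttribute(p, "unix:ino");
    System.out.println("inode = " + inode);
  }
}
{code}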

h4. Windows
Windows has the concept of "File ID" which is similar to inode id. It is unique 
in NTFS and ReFS.

h3. For other FS
The fileId entry will be "0" in FileStatus if it is not set. We could either 
ignore it or throw an exception.






[jira] [Created] (HADOOP-16081) DistCp: Update "Update and Overwrite" doc

2019-01-28 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16081:
---

 Summary: DistCp: Update "Update and Overwrite" doc
 Key: HADOOP-16081
 URL: https://issues.apache.org/jira/browse/HADOOP-16081
 Project: Hadoop Common
  Issue Type: Task
  Components: documentation, tools/distcp
Affects Versions: 3.1.1
Reporter: Siyao Meng
Assignee: Siyao Meng


https://hadoop.apache.org/docs/r3.1.1/hadoop-distcp/DistCp.html#Update_and_Overwrite

The current doc says that -update or -overwrite won't copy the directory 
hierarchies, i.e. the file structure will be "flattened out" on the 
destination. But this has been improved already. (Need to find the jira id 
that made this change.) The dir structure WILL be copied over when the -update 
or -overwrite option is in use.

Now the only caveat for the -update or -overwrite option is that, when 
specifying multiple sources, there shouldn't be files or directories with the 
same relative path.






[jira] [Created] (HADOOP-16071) Fix typo in DistCp Counters - Bandwidth in Bytes

2019-01-24 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16071:
---

 Summary: Fix typo in DistCp Counters - Bandwidth in Bytes
 Key: HADOOP-16071
 URL: https://issues.apache.org/jira/browse/HADOOP-16071
 Project: Hadoop Common
  Issue Type: Bug
  Components: tools/distcp
Affects Versions: 3.2.0
Reporter: Siyao Meng
Assignee: Siyao Meng


{code:bash|title=DistCp MR Job Counters}
...
DistCp Counters
Bandwidth in Btyes=20971520
Bytes Copied=20971520
Bytes Expected=20971520
Files Copied=1
{code}

{noformat}
Bandwidth in Btyes -> Bandwidth in Bytes
{noformat}






[jira] [Created] (HADOOP-16037) DistCp: Document usage of -diff option in detail

2019-01-08 Thread Siyao Meng (JIRA)
Siyao Meng created HADOOP-16037:
---

 Summary: DistCp: Document usage of -diff option in detail
 Key: HADOOP-16037
 URL: https://issues.apache.org/jira/browse/HADOOP-16037
 Project: Hadoop Common
  Issue Type: Task
  Components: documentation, tools/distcp
Reporter: Siyao Meng
Assignee: Siyao Meng


Create a new doc section, similar to "Update and Overwrite", for the -diff 
option. Provide step-by-step guidance.

Current doc link: 
https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html


