Re: About 2.7.4 Release

2017-05-22 Thread Brahma Reddy Battula
Hi Konstantin Shvachko


How about creating a wiki page for the 2.7.4 release status, like the ones for 2.8 and trunk, 
at the following link?


https://cwiki.apache.org/confluence/display/HADOOP



From: Konstantin Shvachko 
Sent: Saturday, May 13, 2017 3:58 AM
To: Akira Ajisaka
Cc: Hadoop Common; Hdfs-dev; mapreduce-dev@hadoop.apache.org; 
yarn-...@hadoop.apache.org
Subject: Re: About 2.7.4 Release

Latest update on the links and filters. Here is the correct link for the
filter:
https://issues.apache.org/jira/secure/IssueNavigator.jspa?requestId=12340814

Also updated: https://s.apache.org/Dzg4

Had to do some Jira debugging. Sorry for the confusion.

Thanks,
--Konstantin

On Wed, May 10, 2017 at 2:30 PM, Konstantin Shvachko 
wrote:

> Hey Akira,
>
> I didn't have private filters. Most probably Jira caches something.
> Your filter is in the right direction, but for some reason it lists only
> 22 issues, while mine has 29.
> It misses e.g. YARN-5543.
>
> Anyway, I have now created a Jira filter "Hadoop 2.7.4 release blockers",
> shared it with "everybody", and updated my link to point to that filter. So
> you can use any of the three methods below to get the correct list:
> 1. Go to https://s.apache.org/Dzg4
> 2. Go to the filter via
> https://issues.apache.org/jira/issues?filter=12340814
>or by finding the "Hadoop 2.7.4 release blockers" filter in Jira
> 3. On Advanced issues search page paste this:
> project in (HDFS, HADOOP, YARN, MAPREDUCE) AND labels = release-blocker
> AND "Target Version/s" = 2.7.4
>
> Hope this solves the confusion about which issues are included.
> Please LMK if it doesn't, as it is important.
>
> Thanks,
> --Konstantin
>
> On Tue, May 9, 2017 at 9:58 AM, Akira Ajisaka  wrote:
>
>> Hi Konstantin,
>>
>> Thank you for volunteering as release manager!
>>
>> > Actually the original link works fine: https://s.apache.org/Dzg4
>> I couldn't see the link. Maybe it is a private filter?
>>
>> Here is a link I generated: https://s.apache.org/ehKy
>> This filter includes resolved issues and excludes issues with fixVersion == 2.7.4
>>
>> Thanks and Regards,
>> Akira
>>
>> On 2017/05/08 19:20, Konstantin Shvachko wrote:
>>
>>> Hi Brahma Reddy Battula,
>>>
>>> Actually the original link works fine: https://s.apache.org/Dzg4
>>> Your link excludes closed and resolved issues, which need backporting and
>>> which we cannot reopen, as discussed earlier in this thread.
>>>
>>> Looked through the issues you proposed:
>>>
>>> HDFS-9311
>>> Seems like a new feature. It helps fail over to the standby node when the
>>> primary is under heavy load, but it introduces new APIs, addresses, and
>>> config parameters, and needs at least one follow-up jira.
>>> Looks like a backward-compatible change, though.
>>> Did you have a chance to run it in production?
>>>
>>> +1 on
>>> HDFS-10987 (Make Decommission less expensive when lot of ...)
>>> HDFS-9902 (Support different values of dfs.datanode.du ...)
>>> HDFS-8312 (Trash does not descent into child directories to check for ...)
>>> HADOOP-14100 (Upgrade Jsch jar to latest version to fix vulnerability in ...)
>>>
>>> Added them to 2.7.4 release. You should see them via the above link now.
>>> It would be good if you could attach backport patches for some of them.
>>>
>>> Appreciate your help,
>>> --Konstantin
>>>
>>> On Mon, May 8, 2017 at 8:39 AM, Brahma Reddy Battula <
>>> brahmareddy.batt...@huawei.com> wrote:
>>>
>>>
 Looks like the following link is not correct:

 https://s.apache.org/Dzg4

 Should it be like the following?

 https://s.apache.org/wi3U


 Apart from what Konstantin mentioned, the following ...

Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le

2017-05-22 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/322/

[May 21, 2017 12:39:13 PM] (sunilg) YARN-5705. Show timeline data from ATS v2 
in new web UI. Contributed by
[May 22, 2017 4:04:06 AM] (yufei.gu) YARN-6111. Rumen input does't work in SLS. 
Contributed by Yufei Gu.
[May 22, 2017 8:40:06 AM] (sunilg) YARN-6584. Correct license headers in 
hadoop-common, hdfs, yarn and


[Error replacing 'FILE' - Workspace is not accessible]

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org

[jira] [Created] (MAPREDUCE-6891) TextInputFormat: duplicate records with custom delimiter

2017-05-22 Thread JIRA
Till Schäfer created MAPREDUCE-6891:
---

 Summary: TextInputFormat: duplicate records with custom delimiter
 Key: MAPREDUCE-6891
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6891
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Till Schäfer


When using a custom delimiter for TextInputFormat, the resulting blocks are not 
correct under some circumstances: the total number of records is wrong and some 
entries are duplicated.

I have created a reproducible test case: 

Generate a file:
{code:bash}
for i in $(seq 1 1000); do
  echo -n $i >> long_delimiter-1to1000-with_newline.txt;
  echo "" >> long_delimiter-1to1000-with_newline.txt;
done
{code}

Java test to reproduce the error:
{code:java}
public static void longDelimiterBug(JavaSparkContext sc) {
    Configuration hadoopConf = new Configuration();
    String delimitedFile = "long_delimiter-1to1000-with_newline.txt";
    hadoopConf.set("textinputformat.record.delimiter", "\n");
    JavaPairRDD<LongWritable, Text> input =
        sc.newAPIHadoopFile(delimitedFile, TextInputFormat.class,
            LongWritable.class, Text.class, hadoopConf);

    List<String> values = input.map(t -> t._2.toString()).collect();

    Assert.assertEquals(1000, values.size());
    for (int i = 0; i < 1000; i++) {
        boolean correct = values.get(i).equals(Integer.toString(i + 1));
        if (!correct) {
            logger.error("Wrong value for index {}: expected {} -> got {}",
                i, i + 1, values.get(i));
        } else {
            logger.info("Correct value for index {}: expected {} -> got {}",
                i, i + 1, values.get(i));
        }
        Assert.assertTrue(correct);
    }
}
{code}

This example fails with the error 
{quote}
java.lang.AssertionError: expected:<1000> but was:<10042616>
{quote}

When commenting out the Assert about the size of the collection, my log output 
ends like this: 
{quote}
[main] INFO  edu.udo.cs.schaefer.testspark.Main  - Correct value for index 663244: expected 663245 -> got 663245
[main] ERROR edu.udo.cs.schaefer.testspark.Main  - Wrong value for index 663245: expected 663246 -> got 660111
{quote}

After the wrong value for index 663245, the values are sorted again and 
continue with 660112, 660113, ...

The error is not reproducible with _\n_ as delimiter, i.e. when not using a 
custom delimiter. 
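
For reference, here is the same delimiter property in a plain MapReduce job. This is only a 
minimal sketch: the "###" delimiter string and the class name are illustrative placeholders, 
not taken from the report above.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class CustomDelimiterJobSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Same property the Spark reproducer sets; "###" is just an example delimiter.
    conf.set("textinputformat.record.delimiter", "###");

    Job job = Job.getInstance(conf, "custom-delimiter-example");
    // Records become Text values split on the custom delimiter instead of '\n'.
    job.setInputFormatClass(TextInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    // ... configure mapper/reducer/output as usual, then job.waitForCompletion(true)
  }
}
{code}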



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)




[jira] [Resolved] (MAPREDUCE-6886) Job History File Permissions configurable

2017-05-22 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen resolved MAPREDUCE-6886.
---
Resolution: Duplicate

Closing this as a duplicate of MAPREDUCE-6288. [~Prabhu Joseph], you can 
suggest this in MAPREDUCE-6288 and see what people think.

> Job History File Permissions configurable
> -
>
> Key: MAPREDUCE-6886
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6886
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: Prabhu Joseph
>
> Currently the mapreduce job history files are written with 770 permissions, 
> so they can be accessed only by the job user or other users in the hadoop 
> group. Customers have users who are not part of the hadoop group but want 
> to access these history files. We could make the permissions configurable, 
> e.g. 770 (strict) or 755 (all), with a default of 770 (see the sketch below).
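
A minimal sketch of what such a knob could look like, assuming a hypothetical property name 
(mapreduce.jobhistory.files.permissions is not an existing configuration key) and keeping 
770 as the default:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.permission.FsPermission;

public class HistoryFilePermissionSketch {
  // Hypothetical key and default, for illustration only.
  static final String HISTORY_FILE_PERMISSIONS_KEY = "mapreduce.jobhistory.files.permissions";
  static final String HISTORY_FILE_PERMISSIONS_DEFAULT = "770";

  static FsPermission historyFilePermission(Configuration conf) {
    // Stay with the strict 770 unless an admin explicitly opts into a wider mode such as 755.
    String octal = conf.get(HISTORY_FILE_PERMISSIONS_KEY, HISTORY_FILE_PERMISSIONS_DEFAULT);
    return new FsPermission(Short.parseShort(octal, 8));
  }
}
{code}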







Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-05-22 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/411/

[May 21, 2017 8:23:35 AM] (liuml07) HDFS-11862. Option -v missing for du 
command in FileSystemShell.md.
[May 21, 2017 12:39:13 PM] (sunilg) YARN-5705. Show timeline data from ATS v2 
in new web UI. Contributed by
[May 22, 2017 4:04:06 AM] (yufei.gu) YARN-6111. Rumen input does't work in SLS. 
Contributed by Yufei Gu.




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-common-project/hadoop-minikdc 
   Possible null pointer dereference in 
org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value of called 
method Dereferenced at 
MiniKdc.java:org.apache.hadoop.minikdc.MiniKdc.delete(File) due to return value 
of called method Dereferenced at MiniKdc.java:[line 368] 
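
For context on this class of warning: java.io.File.listFiles() returns null rather than an 
empty array when the path is not a directory or an I/O error occurs, so its result must be 
checked before use. A generic sketch of the guarded pattern (not the actual MiniKdc code):
{code:java}
import java.io.File;

public class SafeRecursiveDelete {
  // Generic illustration of the guarded delete pattern FindBugs asks for.
  static void delete(File f) {
    File[] children = f.listFiles();
    if (children != null) {        // listFiles() may return null; check before iterating
      for (File child : children) {
        delete(child);
      }
    }
    if (!f.delete()) {
      System.err.println("Failed to delete " + f);
    }
  }
}
{code}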

FindBugs :

   module:hadoop-common-project/hadoop-auth 
   
org.apache.hadoop.security.authentication.server.MultiSchemeAuthenticationHandler.authenticate(HttpServletRequest,
 HttpServletResponse) makes inefficient use of keySet iterator instead of 
entrySet iterator At MultiSchemeAuthenticationHandler.java:of keySet iterator 
instead of entrySet iterator At MultiSchemeAuthenticationHandler.java:[line 
192] 
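
For context on this class of warning: iterating keySet() and then calling get() per key does 
a second map lookup for every entry, while entrySet() yields the key and value together. A 
generic illustration (not the MultiSchemeAuthenticationHandler code):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class KeySetVsEntrySet {
  public static void main(String[] args) {
    Map<String, String> handlers = new HashMap<>();
    handlers.put("basic", "BasicAuthHandler");
    handlers.put("kerberos", "KerberosAuthHandler");

    // Flagged pattern: one extra lookup per key.
    for (String scheme : handlers.keySet()) {
      System.out.println(scheme + " -> " + handlers.get(scheme));
    }

    // Preferred pattern: key and value come from the same entry.
    for (Map.Entry<String, String> e : handlers.entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue());
    }
  }
}
{code}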

FindBugs :

   module:hadoop-common-project/hadoop-common 
   org.apache.hadoop.crypto.CipherSuite.setUnknownValue(int) 
unconditionally sets the field unknownValue At CipherSuite.java:unknownValue At 
CipherSuite.java:[line 44] 
   org.apache.hadoop.crypto.CryptoProtocolVersion.setUnknownValue(int) 
unconditionally sets the field unknownValue At 
CryptoProtocolVersion.java:unknownValue At CryptoProtocolVersion.java:[line 67] 
   Possible null pointer dereference in 
org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to return value of 
called method Dereferenced at 
FileUtil.java:org.apache.hadoop.fs.FileUtil.fullyDeleteOnExit(File) due to 
return value of called method Dereferenced at FileUtil.java:[line 118] 
   Possible null pointer dereference in 
org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path, 
File, Path, File) due to return value of called method Dereferenced at 
RawLocalFileSystem.java:org.apache.hadoop.fs.RawLocalFileSystem.handleEmptyDstDirectoryOnWindows(Path,
 File, Path, File) due to return value of called method Dereferenced at 
RawLocalFileSystem.java:[line 387] 
   Return value of org.apache.hadoop.fs.permission.FsAction.or(FsAction) 
ignored, but method has no side effect At FTPFileSystem.java:but method has no 
side effect At FTPFileSystem.java:[line 421] 
   Useless condition:lazyPersist == true at this point At 
CommandWithDestination.java:[line 502] 
   org.apache.hadoop.io.DoubleWritable.compareTo(DoubleWritable) 
incorrectly handles double value At DoubleWritable.java: At 
DoubleWritable.java:[line 78] 
   org.apache.hadoop.io.DoubleWritable$Comparator.compare(byte[], int, int, 
byte[], int, int) incorrectly handles double value At DoubleWritable.java:int) 
incorrectly handles double value At DoubleWritable.java:[line 97] 
   org.apache.hadoop.io.FloatWritable.compareTo(FloatWritable) incorrectly 
handles float value At FloatWritable.java: At FloatWritable.java:[line 71] 
   org.apache.hadoop.io.FloatWritable$Comparator.compare(byte[], int, int, 
byte[], int, int) incorrectly handles float value At FloatWritable.java:int) 
incorrectly handles float value At FloatWritable.java:[line 89] 
   Possible null pointer dereference in 
org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) due to return 
value of called method Dereferenced at 
IOUtils.java:org.apache.hadoop.io.IOUtils.listDirectory(File, FilenameFilter) 
due to return value of called method Dereferenced at IOUtils.java:[line 350] 
   org.apache.hadoop.io.erasurecode.ECSchema.toString() makes inefficient 
use of keySet iterator instead of entrySet iterator At ECSchema.java:keySet 
iterator instead of entrySet iterator At ECSchema.java:[line 193] 
   Possible bad parsing of shift operation in 
org.apache.hadoop.io.file.tfile.Utils$Version.hashCode() At 
Utils.java:operation in 
org.apache.hadoop.io.file.tfile.Utils$Version.hashCode() At Utils.java:[line 
398] 
   
org.apache.hadoop.metrics2.lib.DefaultMetricsFactory.setInstance(MutableMetricsFactory)
 unconditionally sets the field mmfImpl At DefaultMetricsFactory.java:mmfImpl 
At DefaultMetricsFactory.java:[line 49] 
   
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.setMiniClusterMode(boolean) 
unconditionally sets the field miniClusterMode At 
DefaultMetricsSystem.java:miniClusterMode At DefaultMetricsSystem.java:[line 
100] 
   Useless object
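
For context on the DoubleWritable/FloatWritable comparator warnings above: relational 
operators treat NaN as unordered, so a hand-rolled comparison can report NaN as "equal" to 
anything, whereas Double.compare/Float.compare define a total order. A generic sketch, not 
the actual Writable code:
{code:java}
public class FloatingPointCompareSketch {
  // Naive comparison of the kind FindBugs flags: every relational test against NaN is
  // false, so this returns 0 ("equal") for NaN versus any value.
  static int naiveCompare(double a, double b) {
    return a < b ? -1 : (a > b ? 1 : 0);
  }

  public static void main(String[] args) {
    System.out.println(naiveCompare(Double.NaN, 1.0));    // 0  (incorrectly "equal")
    System.out.println(Double.compare(Double.NaN, 1.0));  // 1  (NaN ordered after all values)
  }
}
{code}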