[jira] [Reopened] (HADOOP-10434) Is it possible to use "df" to calculate the dfs usage instead of "du"
[ https://issues.apache.org/jira/browse/HADOOP-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-10434: -- Reopening to close as Duplicate rather than Fixed. > Is it possible to use "df" to calculate the dfs usage instead of "du" > - > > Key: HADOOP-10434 > URL: https://issues.apache.org/jira/browse/HADOOP-10434 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.3.0 >Reporter: MaoYuan Xian >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HADOOP-10434-1.patch > > > When we run the datanode on a machine with a big disk volume, we find that the du > operations from org.apache.hadoop.fs.DU's DURefreshThread cost a lot of disk > performance. > As we use the whole disk for hdfs storage, it is possible to calculate the volume > usage via the "df" command. Is it necessary to add a "df" option for usage > calculation in hdfs > (org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice)?
[jira] [Resolved] (HADOOP-10434) Is it possible to use "df" to calculate the dfs usage instead of "du"
[ https://issues.apache.org/jira/browse/HADOOP-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-10434. -- Resolution: Duplicate > Is it possible to use "df" to calculate the dfs usage instead of "du" > - > > Key: HADOOP-10434 > URL: https://issues.apache.org/jira/browse/HADOOP-10434 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 2.3.0 >Reporter: MaoYuan Xian >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 2.8.0 > > Attachments: HADOOP-10434-1.patch > > > When we run the datanode on a machine with a big disk volume, we find that the du > operations from org.apache.hadoop.fs.DU's DURefreshThread cost a lot of disk > performance. > As we use the whole disk for hdfs storage, it is possible to calculate the volume > usage via the "df" command. Is it necessary to add a "df" option for usage > calculation in hdfs > (org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice)?
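For reference, a minimal sketch of the df-style calculation being proposed, using Java's File API rather than shelling out to df; this is illustrative only, not the contents of HADOOP-10434-1.patch:

{code}
import java.io.File;

// df-style usage estimate for a volume dedicated to HDFS: one cheap
// filesystem stat instead of a recursive du walk over every block file.
public class DfStyleUsage {
  public static long estimateUsed(File volumeRoot) {
    return volumeRoot.getTotalSpace() - volumeRoot.getFreeSpace();
  }

  public static void main(String[] args) {
    System.out.println(estimateUsed(new File(args[0])) + " bytes used");
  }
}
{code}

As the reporter notes, this estimate is only accurate when the whole partition is dedicated to HDFS storage.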
[jira] [Created] (HADOOP-13817) Add a finite shell command timeout to ShellBasedUnixGroupsMapping
Harsh J created HADOOP-13817: Summary: Add a finite shell command timeout to ShellBasedUnixGroupsMapping Key: HADOOP-13817 URL: https://issues.apache.org/jira/browse/HADOOP-13817 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 2.6.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor The ShellBasedUnixGroupsMapping runs various {{id}} commands via the ShellCommandExecutor modules without a timeout set (it is set to 0, which implies infinite). If this command hangs for a long time on the OS end due to an unresponsive groups backend or other reasons, it also blocks the handlers that use it on the NameNode (or other services that use this class). That inadvertently causes odd timeout troubles on the client end, where it is forced to retry (only to likely run into such hangs again with every attempt, until at least one command returns). It would be helpful to have a finite command timeout after which we may give up on the command and return the result equivalent of no groups found.
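A rough sketch of what such a bounded lookup could look like, assuming the existing {{Shell.ShellCommandExecutor}} constructor that takes a timeout in milliseconds; the timeout value and fallback behaviour here are illustrative, not the committed patch:

{code}
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.util.Shell.ShellCommandExecutor;

public class TimedGroupLookup {
  private static final long TIMEOUT_MS = 60_000L; // assumed default

  // Run "id -gn <user>" with a finite timeout; on expiry, return the
  // equivalent of "no groups found" instead of blocking the RPC handler.
  public static List<String> getGroups(String user) throws IOException {
    ShellCommandExecutor executor = new ShellCommandExecutor(
        new String[] {"id", "-gn", user}, null, null, TIMEOUT_MS);
    try {
      executor.execute();
    } catch (IOException e) {
      if (executor.isTimedOut()) {
        return Collections.emptyList();
      }
      throw e;
    }
    return Arrays.asList(executor.getOutput().trim().split("\\s+"));
  }
}
{code}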
[jira] [Resolved] (HADOOP-8134) DNS claims to return a hostname but returns a PTR record in some cases
[ https://issues.apache.org/jira/browse/HADOOP-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8134. - Resolution: Not A Problem Assignee: (was: Harsh J) This hasn't proven to be a problem of late. Closing as stale. > DNS claims to return a hostname but returns a PTR record in some cases > -- > > Key: HADOOP-8134 > URL: https://issues.apache.org/jira/browse/HADOOP-8134 > Project: Hadoop Common > Issue Type: Bug > Components: util >Affects Versions: 0.23.0 >Reporter: Harsh J >Priority: Minor > > Per Shrijeet on HBASE-4109: > {quote} > If you are using an interface anything other than 'default' (literally that > keyword), DNS.java's getDefaultHost will return a string which will have a > trailing period at the end. It seems the javadoc of reverseDns in DNS.java (see > below) is conflicting with what that function is actually doing. > It is returning a PTR record while it claims it returns a hostname. The PTR > record always has a period at the end, RFC: > http://irbs.net/bog-4.9.5/bog47.html > We make calls to DNS.getDefaultHost at more than one place and treat that as > the actual hostname. > Quoting HRegionServer for example > String machineName = DNS.getDefaultHost(conf.get( > "hbase.regionserver.dns.interface", "default"), conf.get( > "hbase.regionserver.dns.nameserver", "default")); > We may want to sanitize the string returned from the DNS class. Or better, we can > take the path of overhauling the way we do DNS name matching all over. > {quote} > While HBase has worked around the issue, we should fix the methods that > aren't doing what they intended. > 1. We fix the method. This may be an 'incompatible change'. But I do not know > who outside of us uses the DNS classes. > 2. We fix HDFS's DN at the calling end, because that is affected by the > trailing period in its reporting back to the NN as well (just affects NN->DN > weblinks, non-critical). > For 2, we can close this and open an HDFS JIRA. > Thoughts?
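A one-line sketch of the sanitization that callers (or the DNS methods themselves) could apply before treating the result as a hostname; illustrative, not the committed fix:

{code}
// A PTR record is fully qualified and ends with a period; strip it so the
// value behaves like the plain hostname the javadoc promises.
static String stripTrailingDot(String fqdn) {
  return fqdn.endsWith(".") ? fqdn.substring(0, fqdn.length() - 1) : fqdn;
}
{code}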
[jira] [Resolved] (HADOOP-7505) EOFException in RPC stack should have a nicer error message
[ https://issues.apache.org/jira/browse/HADOOP-7505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7505. - Resolution: Duplicate Assignee: (was: Harsh J) This seems to have been taken care of (in part) via HADOOP-7346 > EOFException in RPC stack should have a nicer error message > --- > > Key: HADOOP-7505 > URL: https://issues.apache.org/jira/browse/HADOOP-7505 > Project: Hadoop Common > Issue Type: Improvement > Components: ipc >Affects Versions: 0.23.0 >Reporter: Eli Collins >Priority: Minor > > Lots of user logs involve a user running mismatched versions, and for some > reason or another, they get an EOFException instead of a proper version mismatch > exception. We should be able to catch this at appropriate points, and show a > nicer exception message explaining that it's a possible version mismatch, or > that they're trying to connect to the incorrect port.
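The kind of wrapping being asked for might look like this sketch (a hypothetical catch site in the IPC client, not what HADOOP-7346 actually committed):

{code}
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class FriendlyEof {
  // Sketch: wrap the bare EOFException with a likelier explanation before
  // it bubbles up to the user.
  static int readResponse(DataInputStream in, String server) throws IOException {
    try {
      return in.readInt();
    } catch (EOFException eof) {
      throw new IOException("Unexpected end of stream while talking to "
          + server + ". This often indicates a client/server version"
          + " mismatch, or a connection to the wrong port.", eof);
    }
  }
}
{code}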
[jira] [Resolved] (HADOOP-8579) Websites for HDFS and MapReduce both send users to video training resource which is non-public
[ https://issues.apache.org/jira/browse/HADOOP-8579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8579. - Resolution: Not A Problem Assignee: (was: Harsh J) This does not appear to be a problem after the project re-merge. > Websites for HDFS and MapReduce both send users to video training resource > which is non-public > -- > > Key: HADOOP-8579 > URL: https://issues.apache.org/jira/browse/HADOOP-8579 > Project: Hadoop Common > Issue Type: Bug > Environment: website >Reporter: David L. Willson >Priority: Minor > Original Estimate: 2h > Remaining Estimate: 2h > > The main pages for HDFS and MapReduce send new users to an unavailable training > resource. > These two pages: > http://hadoop.apache.org/mapreduce/ > http://hadoop.apache.org/hdfs/ > Link to this page: > http://vimeo.com/3584536 > That page is not public, not shared with all registered Vimeo users, and I > see nothing indicating how to ask for access to the resource. > Please make the vids public, or remove the link of disappointment.
[jira] [Resolved] (HADOOP-8863) Eclipse plugin may not be working on Juno due to changes in it
[ https://issues.apache.org/jira/browse/HADOOP-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8863. - Resolution: Won't Fix Assignee: (was: Harsh J) The Eclipse plugin has been formally removed. > Eclipse plugin may not be working on Juno due to changes in it > -- > > Key: HADOOP-8863 > URL: https://issues.apache.org/jira/browse/HADOOP-8863 > Project: Hadoop Common > Issue Type: Bug > Components: contrib/eclipse-plugin >Affects Versions: 1.2.0 >Reporter: Harsh J > > We need to debug/investigate why it is so.
[jira] [Created] (HADOOP-13515) Redundant transitionToActive call can cause a NameNode to crash
Harsh J created HADOOP-13515: Summary: Redundant transitionToActive call can cause a NameNode to crash Key: HADOOP-13515 URL: https://issues.apache.org/jira/browse/HADOOP-13515 Project: Hadoop Common Issue Type: Bug Components: ha Affects Versions: 2.5.0 Reporter: Harsh J Priority: Minor The situation in parts is similar to HADOOP-8217, but the cause is different and so is the result. Consider this situation: - At the beginning NN1 is Active, NN2 is Standby - ZKFC1 faces a ZK disconnect (not a session timeout, just a socket disconnect) and thereby reconnects {code} 2016-08-11 07:00:46,068 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 4000ms for sessionid 0x4566f0c97500bd9, closing socket connection and attempting reconnect 2016-08-11 07:00:46,169 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session disconnected. Entering neutral mode... … 2016-08-11 07:00:46,610 INFO org.apache.hadoop.ha.ActiveStandbyElector: Session connected. {code} - The reconnection on the ZKFC1 triggers the elector code, and the elector re-run finds that NN1 should be the new active (a redundant decision cause NN1 is already active) {code} 2016-08-11 07:00:46,615 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced... 2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: … 2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: But old node has our own data, so don't need to fence it. {code} - The ZKFC1 sets the new ZK data, and fires a NN1 RPC call of transitionToActive {code} 2016-08-11 07:00:46,630 INFO org.apache.hadoop.ha.ActiveStandbyElector: Writing znode /hadoop-ha/nameservice1/ActiveBreadCrumb to indicate that the local node is the most recent active... 2016-08-11 07:00:46,649 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 175: Call -> nn01/10.10.10.10:8022: transitionToActive {reqInfo { reqSource: REQUEST_BY_ZKFC }} {code} - At the same time as the transitionToActive call is in progress at NN1, but not complete yet, the ZK session of ZKFC1 is timed out by ZK Quorum, and a watch notification is sent to ZKFC2 {code} 2016-08-11 07:01:00,003 DEBUG org.apache.zookeeper.ClientCnxn: Got notification sessionid:0x4566f0c97500bde 2016-08-11 07:01:00,004 DEBUG org.apache.zookeeper.ClientCnxn: Got WatchedEvent state:SyncConnected type:NodeDeleted path:/hadoop-ha/nameservice1/ActiveStandbyElectorLock for sessionid 0x4566f0c97500bde {code} - ZKFC2 responds by marking NN2 as standby, which succeeds (NN hasn't handled transitionToActive call yet due to busy status, but has handled transitionToStandby before it) {code} 2016-08-11 07:01:00,013 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced... 2016-08-11 07:01:00,018 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at nn01/10.10.10.10:8022 2016-08-11 07:01:00,020 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: Call -> nn01/10.10.10.10:8022: transitionToStandby {reqInfo { reqSource: REQUEST_BY_ZKFC }} 2016-08-11 07:01:03,880 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: transitionToStandby took 3860ms {code} - ZKFC2 then marks NN2 as active, and NN2 begins its transition (is in midst of it, not done yet at this point) {code} 2016-08-11 07:01:03,894 INFO org.apache.hadoop.ha.ZKFailoverController: Trying to make NameNode at nn02/11.11.11.11:8022 active... 
2016-08-11 07:01:03,895 TRACE org.apache.hadoop.ipc.ProtobufRpcEngine: 412: Call -> nn02/11.11.11.11:8022: transitionToActive {reqInfo { reqSource: REQUEST_BY_ZKFC }} … {code} {code} 2016-08-11 07:01:09,558 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state … 2016-08-11 07:01:19,968 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635 {code} - At the same time in parallel NN1 processes the transitionToActive requests finally, and becomes active {code} 2016-08-11 07:01:13,281 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Starting services required for active state … 2016-08-11 07:01:19,599 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635 … 2016-08-11 07:01:19,602 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Starting log segment at 5635 {code} - NN2's active transition fails as a result of this parallel active transition on NN1 which has completed right before it tries to take over {code} 2016-08-11 07:01:19,968 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Will take over writing edit logs at txnid 5635 2016-08-11 07:01:22,799 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Error encountered requiring NN
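One way to harden against this class of race, sketched loosely below; this is an illustration of the general idea (epoch-style request fencing so a NameNode can reject a stale, redundant transition request), not the fix that was eventually committed:

{code}
// Hypothetical guard: tag each ZKFC election win with a monotonically
// increasing epoch, and have the NameNode reject transition requests whose
// epoch is not newer than the last one it honored.
public class TransitionGuard {
  private long lastHonoredEpoch = -1;

  public synchronized void transitionToActive(long requestEpoch)
      throws java.io.IOException {
    if (requestEpoch <= lastHonoredEpoch) {
      throw new java.io.IOException(
          "Rejecting stale transitionToActive (epoch " + requestEpoch
              + " <= " + lastHonoredEpoch + ")");
    }
    lastHonoredEpoch = requestEpoch;
    // ... proceed with the actual active-state transition ...
  }
}
{code}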
[jira] [Created] (HADOOP-13056) Print expected values when rejecting a server's determined principal
Harsh J created HADOOP-13056: Summary: Print expected values when rejecting a server's determined principal Key: HADOOP-13056 URL: https://issues.apache.org/jira/browse/HADOOP-13056 Project: Hadoop Common Issue Type: Bug Components: security Affects Versions: 2.5.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial When a service principal that a client constructs from the server address does not match a provided pattern or the configured principal property, the error is very uninformative about what the specific cause is. Currently the only error printed, in both cases, is: {code} java.lang.IllegalArgumentException: Server has invalid Kerberos principal: hdfs/host.internal@REALM {code}
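A sketch of the more informative message being asked for; the parameter names are taken from the description above and the exact wording is illustrative:

{code}
// Sketch: include the expected values alongside the rejected principal.
static void rejectPrincipal(String serverPrincipal, String pattern,
    String configuredPrincipal) {
  throw new IllegalArgumentException(
      "Server has invalid Kerberos principal: " + serverPrincipal
          + "; expected it to match pattern '" + pattern
          + "' or configured principal '" + configuredPrincipal + "'");
}
{code}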
[jira] [Created] (HADOOP-13051) Test for special characters in path being respected during globPaths
Harsh J created HADOOP-13051: Summary: Test for special characters in path being respected during globPaths Key: HADOOP-13051 URL: https://issues.apache.org/jira/browse/HADOOP-13051 Project: Hadoop Common Issue Type: Test Components: fs Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor On {{branch-2}}, the below is the (incorrect) behaviour today, where paths with special characters get dropped during globStatus calls: {code} bin/hdfs dfs -mkdir /foo bin/hdfs dfs -touchz /foo/foo1 bin/hdfs dfs -touchz $'/foo/foo1\r' bin/hdfs dfs -ls /foo -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M bin/hdfs dfs -ls '/foo/*' -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 {code} Whereas trunk has the right behaviour, subtly fixed via the pattern library change of HADOOP-12436: {code} bin/hdfs dfs -mkdir /foo bin/hdfs dfs -touchz /foo/foo1 bin/hdfs dfs -touchz $'/foo/foo1\r' bin/hdfs dfs -ls /foo -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M bin/hdfs dfs -ls '/foo/*' -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1 -rw-r--r-- 3 harsh supergroup 0 2016-04-22 17:35 /foo/foo1^M {code} (I've placed a ^M explicitly to indicate the presence of the intentional hidden character) We should still add a simple test-case to cover this situation for future regressions.
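A simple regression test along the proposed lines could look like this sketch (JUnit; the class and method names are illustrative and the FileSystem setup is assumed):

{code}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.Test;

public class TestGlobSpecialCharacters {
  // Assumed to be initialized against a test cluster (e.g. in a @Before
  // method with a MiniDFSCluster); omitted here for brevity.
  private FileSystem fs;

  @Test
  public void testGlobRetainsSpecialCharacterPaths() throws Exception {
    fs.mkdirs(new Path("/foo"));
    fs.create(new Path("/foo/foo1")).close();
    fs.create(new Path("/foo/foo1\r")).close(); // intentional trailing CR
    FileStatus[] matches = fs.globStatus(new Path("/foo/*"));
    // Both files must survive the glob, including the CR-suffixed one.
    assertEquals(2, matches.length);
  }
}
{code}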
[jira] [Created] (HADOOP-12970) Intermittent signature match failures in S3AFileSystem due to connection closure
Harsh J created HADOOP-12970: Summary: Intermittent signature match failures in S3AFileSystem due to connection closure Key: HADOOP-12970 URL: https://issues.apache.org/jira/browse/HADOOP-12970 Project: Hadoop Common Issue Type: Bug Components: fs/s3 Affects Versions: 2.7.0 Reporter: Harsh J Assignee: Harsh J S3AFileSystem's use of the {{ObjectMetadata#clone()}} method inside the {{copyFile}} implementation may fail in circumstances where the connection used for obtaining the metadata is closed by the server (i.e. the response carries a {{Connection: close}} header). Due to this header not being stripped away when the {{ObjectMetadata}} is created, and due to us cloning it for use in the next {{CopyObjectRequest}}, it causes the request to use {{Connection: close}} headers as a part of itself. This causes signer-related exceptions because the client now includes the {{Connection}} header as part of the {{SignedHeaders}}, but the S3 server does not receive the same value for it ({{Connection}} headers are likely stripped away before the S3 Server tries to match signature hashes), causing a failure like below: {code} 2016-03-29 19:59:30,120 DEBUG [s3a-transfer-shared--pool1-t35] org.apache.http.wire: >> "Authorization: AWS4-HMAC-SHA256 Credential=XXX/20160329/eu-central-1/s3/aws4_request, SignedHeaders=accept-ranges;connection;content-length;content-type;etag;host;last-modified;user-agent;x-amz-acl;x-amz-content-sha256;x-amz-copy-source;x-amz-date;x-amz-metadata-directive;x-amz-server-side-encryption;x-amz-version-id, Signature=MNOPQRSTUVWXYZ[\r][\n]" … com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we calculated does not match the signature you provided. Check your key and signing method. (Service: Amazon S3; Status Code: 403; Error Code: SignatureDoesNotMatch; Request ID: ABC), S3 Extended Request ID: XYZ {code} This is intermittent because the S3 Server does not always add a {{Connection: close}} directive in its response, but whenever we receive it AND we clone it, the above exception happens for the copy request. The copy request is often used in the context of FileOutputCommitter, when a lot of the MR attempt files on the {{s3a://}} destination filesystem are to be moved to their parent directories post-commit. I've also submitted a fix upstream with the AWS Java SDK to strip out the {{Connection}} headers when dealing with {{ObjectMetadata}}, which is pending acceptance and release at: https://github.com/aws/aws-sdk-java/pull/669, but until that release is available and can be used by us, we'll need to work around the clone approach by manually excluding the {{Connection}} header (not straightforward due to the {{metadata}} object being private with no mutable access). We can remove such a change in future when there's a release available with the upstream fix.
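Until the SDK fix lands, the described workaround could look roughly like this sketch, built only on the public {{ObjectMetadata}} accessors (not necessarily the exact committed code):

{code}
import java.util.Map;

import com.amazonaws.services.s3.model.ObjectMetadata;

public class MetadataCopier {
  // Copy ObjectMetadata for reuse in a CopyObjectRequest, dropping the
  // hop-by-hop Connection header that breaks request signing.
  static ObjectMetadata copyWithoutConnection(ObjectMetadata source) {
    ObjectMetadata copy = new ObjectMetadata();
    for (Map.Entry<String, Object> e : source.getRawMetadata().entrySet()) {
      if (!"Connection".equalsIgnoreCase(e.getKey())) {
        copy.setHeader(e.getKey(), e.getValue());
      }
    }
    copy.setUserMetadata(source.getUserMetadata());
    return copy;
  }
}
{code}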
[jira] [Created] (HADOOP-12894) Add yarn.app.mapreduce.am.log.level to mapred-default.xml
Harsh J created HADOOP-12894: Summary: Add yarn.app.mapreduce.am.log.level to mapred-default.xml Key: HADOOP-12894 URL: https://issues.apache.org/jira/browse/HADOOP-12894 Project: Hadoop Common Issue Type: Improvement Components: documentation Affects Versions: 2.9.0 Reporter: Harsh J Assignee: Harsh J Priority: Trivial
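The addition would presumably be a standard mapred-default.xml entry along these lines; the description text here is a sketch, with INFO being the code-side default as far as I know:

{code}
<property>
  <name>yarn.app.mapreduce.am.log.level</name>
  <value>INFO</value>
  <description>The logging level for the MR ApplicationMaster.</description>
</property>
{code}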
[jira] [Created] (HADOOP-12549) Extend HDFS-7546 default generically to all pattern lookups
Harsh J created HADOOP-12549: Summary: Extend HDFS-7546 default generically to all pattern lookups Key: HADOOP-12549 URL: https://issues.apache.org/jira/browse/HADOOP-12549 Project: Hadoop Common Issue Type: Improvement Components: ipc, security Affects Versions: 2.7.1 Reporter: Harsh J Assignee: Harsh J Priority: Minor In HDFS-7546 we added an hdfs-default.xml property to bring back the regular behaviour of trusting all principals (as was the case before HADOOP-9789). However, the change only targeted HDFS users, and also only those that used the default-loading mechanism of the Configuration class (i.e. not {{new Configuration(false)}} users). I'd like to propose adding the same default to the generic RPC client code also, so the default affects all forms of clients equally.
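For context, the HDFS-side default added by HDFS-7546 looks like the entry below; the proposal is to honor an equivalent default in the generic RPC client code too. The HDFS property shown is the existing one; the description wording is a sketch:

{code}
<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
  <description>Accept any server principal by default, restoring the
  pre-HADOOP-9789 behaviour for clients connecting across realms.</description>
</property>
{code}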
[jira] [Resolved] (HADOOP-9461) JobTracker and NameNode both grant delegation tokens to non-secure clients
[ https://issues.apache.org/jira/browse/HADOOP-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9461. - Resolution: Won't Fix Not an issue on trunk/branch-2. JobTracker and NameNode both grant delegation tokens to non-secure clients -- Key: HADOOP-9461 URL: https://issues.apache.org/jira/browse/HADOOP-9461 Project: Hadoop Common Issue Type: Bug Components: security Reporter: Harsh J Assignee: Harsh J Priority: Minor If one looks at the MAPREDUCE-1516-added logic in JobTracker.java's isAllowedDelegationTokenOp() method, and applies the non-secure states of UGI.isSecurityEnabled == false and authMethod == SIMPLE, the return result is true when the intention is false (due to the short-circuited conditionals). This allows non-secure JobClients to easily request and use DelegationTokens, and causes unwanted errors to be printed in the JobTracker when the renewer attempts to run. Ideally such clients ought to get an error if they request a DT in non-secure mode. HDFS in trunk and branch-1 both have the same problem. Trunk MR (HistoryServer) and YARN are, however, unaffected due to a simpler, inlined logic instead of reuse of this faulty method. Note that fixing this will break Oozie today, due to the merged logic of OOZIE-734. Oozie will require a fix as well if this is to be fixed in branch-1. As a result, I'm going to mark this as an Incompatible Change.
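A sketch of the stricter check being described; this is an illustrative rewrite of the faulty method's logic, not the committed patch:

{code}
import java.io.IOException;

import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.UserGroupInformation.AuthenticationMethod;

public class DelegationTokenOpCheck {
  static boolean isAllowedDelegationTokenOp(AuthenticationMethod authMethod)
      throws IOException {
    // Reject outright when security is off, instead of falling through the
    // short-circuited conditionals and returning true.
    if (!UserGroupInformation.isSecurityEnabled()) {
      return false;
    }
    return authMethod == AuthenticationMethod.KERBEROS
        || authMethod == AuthenticationMethod.KERBEROS_SSL
        || authMethod == AuthenticationMethod.CERTIFICATE;
  }
}
{code}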
[jira] [Created] (HADOOP-11512) Use getTrimmedStrings when reading serialization keys
Harsh J created HADOOP-11512: Summary: Use getTrimmedStrings when reading serialization keys Key: HADOOP-11512 URL: https://issues.apache.org/jira/browse/HADOOP-11512 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 2.6.0 Reporter: Harsh J Priority: Minor In the file {{hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/serializer/SerializationFactory.java}}, we grab the IO_SERIALIZATIONS_KEY config via Configuration#getStrings(…), which does not trim the input. This could cause confusing user issues if someone manually overrides the key in the XML files/Configuration object without using the dynamic approach. The call should instead use Configuration#getTrimmedStrings(…), so the whitespace is trimmed before the class names are searched for on the classpath.
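The proposed change is essentially a one-line swap in SerializationFactory; a diff-style sketch:

{code}
// Before: whitespace in a manually-edited value breaks classpath lookups.
// String[] serializations = conf.getStrings(
//     CommonConfigurationKeys.IO_SERIALIZATIONS_KEY);

// After: trim each entry before resolving the class names.
String[] serializations = conf.getTrimmedStrings(
    CommonConfigurationKeys.IO_SERIALIZATIONS_KEY);
{code}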
[jira] [Created] (HADOOP-11488) Difference in default connection timeout for S3A FS
Harsh J created HADOOP-11488: Summary: Difference in default connection timeout for S3A FS Key: HADOOP-11488 URL: https://issues.apache.org/jira/browse/HADOOP-11488 Project: Hadoop Common Issue Type: Bug Components: fs/s3 Affects Versions: 2.6.0 Reporter: Harsh J Priority: Minor The core-default.xml defines fs.s3a.connection.timeout as 5000, and the code under hadoop-tools/hadoop-aws defines it as 50000. We should update the former to 50s so it gets applied properly, as we're also noticing that 5s is often too low, especially in cases such as large DistCp operations (which fail with {{Read timed out}} errors from the S3 service).
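The fix would then be to align the core-default.xml entry with the code; a sketch:

{code}
<property>
  <name>fs.s3a.connection.timeout</name>
  <value>50000</value>
  <description>Socket connection timeout in milliseconds.</description>
</property>
{code}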
[jira] [Created] (HADOOP-11224) Improve error messages for all permission related failures
Harsh J created HADOOP-11224: Summary: Improve error messages for all permission related failures Key: HADOOP-11224 URL: https://issues.apache.org/jira/browse/HADOOP-11224 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.2.0 Reporter: Harsh J Priority: Trivial If a bad file create request fails, you get a juicy error that almost fully self-describes the reason: {code}Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=root, access=WRITE, inode=/:hdfs:supergroup:drwxr-xr-x{code} However, if a setPermission fails, one only gets a vague: {code}Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied{code} It would be nicer if all forms of permission failures logged the accessed inode and its current ownership and permissions in the same way.
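A sketch of the richer message a failed setPermission could raise, mirroring what the create path already prints; the helper and parameter names are illustrative:

{code}
import org.apache.hadoop.security.AccessControlException;

public class PermissionErrors {
  // Include the inode and its owner/group/mode in every permission failure,
  // not just in the create/write path.
  static void throwAccessDenied(String user, String access, String path,
      String owner, String group, String mode) throws AccessControlException {
    throw new AccessControlException(
        "Permission denied: user=" + user + ", access=" + access
            + ", inode=" + path + ":" + owner + ":" + group + ":" + mode);
  }
}
{code}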
[jira] [Resolved] (HADOOP-8719) Workaround for kerberos-related log errors upon running any hadoop command on OSX
[ https://issues.apache.org/jira/browse/HADOOP-8719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8719. - Resolution: Fixed When this was committed, OSX was not a targeted platform for security or native support. If that has changed recently, let's revert this fix over a new JIRA - I see no issues with doing that. The fix here merely got rid of a verbose warning appearing unnecessarily on unsecured pseudo-distributed clusters running on OSX. Re-resolving. Thanks! Workaround for kerberos-related log errors upon running any hadoop command on OSX - Key: HADOOP-8719 URL: https://issues.apache.org/jira/browse/HADOOP-8719 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.0.0-alpha Environment: Mac OS X 10.7, Java 1.6.0_26 Reporter: Jianbin Wei Priority: Trivial Fix For: 3.0.0 Attachments: HADOOP-8719.patch, HADOOP-8719.patch, HADOOP-8719.patch, HADOOP-8719.patch When starting Hadoop on OS X 10.7 (Lion) using start-all.sh, Hadoop logs the following errors: 2011-07-28 11:45:31.469 java[77427:1a03] Unable to load realm info from SCDynamicStore Hadoop does seem to function properly despite this. The workaround takes only 10 minutes. There are numerous discussions about this: googling "Unable to load realm mapping info from SCDynamicStore" returns 1770 hits, each with many discussions. Assuming each discussion takes only 5 minutes, a 10-minute fix can save ~150 hours. This does not count the wider searching around this issue and its solution/workaround, which can easily hit (wasted) thousands of hours!
[jira] [Resolved] (HADOOP-10707) support bzip2 in python avro tool
[ https://issues.apache.org/jira/browse/HADOOP-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-10707. -- Resolution: Invalid Moved to AVRO-1527 support bzip2 in python avro tool - Key: HADOOP-10707 URL: https://issues.apache.org/jira/browse/HADOOP-10707 Project: Hadoop Common Issue Type: Improvement Components: tools Reporter: Eustache Priority: Minor Labels: avro The Python tool to decode avro files is currently missing support for bzip2 compression.
[jira] [Created] (HADOOP-10572) Example NFS mount command must pass noacl as it isn't supported by the server yet
Harsh J created HADOOP-10572: Summary: Example NFS mount command must pass noacl as it isn't supported by the server yet Key: HADOOP-10572 URL: https://issues.apache.org/jira/browse/HADOOP-10572 Project: Hadoop Common Issue Type: Improvement Components: nfs Affects Versions: 2.4.0 Reporter: Harsh J Priority: Trivial Use of the documented default mount command results in the below server-side WARN log event, because the client tries to locate the ACL program (#100227): {code} 12:26:11.975 AM TRACE org.apache.hadoop.oncrpc.RpcCall Xid:-1114380537, messageType:RPC_CALL, rpcVersion:2, program:100227, version:3, procedure:0, credential:(AuthFlavor:AUTH_NONE), verifier:(AuthFlavor:AUTH_NONE) 12:26:11.976 AM TRACE org.apache.hadoop.oncrpc.RpcProgram NFS3 procedure #0 12:26:11.976 AM WARN org.apache.hadoop.oncrpc.RpcProgram Invalid RPC call program 100227 {code} The client mount command must pass {{noacl}} to avoid this.
[jira] [Resolved] (HADOOP-10002) Tool's config option wouldn't work on secure clusters
[ https://issues.apache.org/jira/browse/HADOOP-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-10002. -- Resolution: Duplicate Fix Version/s: 2.0.3-alpha Sorry about the noise. This should be fixed by HADOOP-9021 - turns out I wasn't looking at the right 2.0.x sources when debugging this. Tool's config option wouldn't work on secure clusters - Key: HADOOP-10002 URL: https://issues.apache.org/jira/browse/HADOOP-10002 Project: Hadoop Common Issue Type: Bug Components: security, util Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor Fix For: 2.0.3-alpha The Tool framework provides a way for clients to run without classpath *-site.xml configs, by letting users pass a -conf file to parse into the app's Configuration object. In a secure cluster config setup, such a runner will not work because the UserGroupInformation.isSecurityEnabled() check, which is used in Server.java to determine what form of communication to use, statically loads a {{new Configuration()}} object to inspect whether security is turned on during its initialization; this ignores the application config object, tries to load from the classpath, and ends up loading non-secure defaults.
[jira] [Created] (HADOOP-10002) Tool's config option wouldn't work on secure clusters
Harsh J created HADOOP-10002: Summary: Tool's config option wouldn't work on secure clusters Key: HADOOP-10002 URL: https://issues.apache.org/jira/browse/HADOOP-10002 Project: Hadoop Common Issue Type: Bug Components: security, util Affects Versions: 2.0.6-alpha Reporter: Harsh J Priority: Minor The Tool framework provides a way for clients to run without classpath *-site.xml configs, by letting users pass a -conf file to parse into the app's Configuration object. In a secure cluster config setup, such a runner will not work because the UserGroupInformation.isSecurityEnabled() check, which is used in Server.java to determine what form of communication to use, statically loads a {{new Configuration()}} object to inspect whether security is turned on during its initialization; this ignores the application config object, tries to load from the classpath, and ends up loading non-secure defaults.
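One client-side workaround, sketched below, is to hand the Tool's Configuration to UGI explicitly before any RPC is made, so the static security check sees the -conf contents; the class name here is hypothetical:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.util.Tool;

// Sketch: a Tool base class that pushes its (possibly -conf supplied)
// Configuration into UserGroupInformation before touching secured services.
public abstract class SecureAwareTool extends Configured implements Tool {
  @Override
  public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    UserGroupInformation.setConfiguration(conf);
    return doRun(args);
  }

  protected abstract int doRun(String[] args) throws Exception;
}
{code}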
[jira] [Reopened] (HADOOP-9878) getting rid of all the 'bin/../' from all the paths
[ https://issues.apache.org/jira/browse/HADOOP-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-9878: - getting rid of all the 'bin/../' from all the paths --- Key: HADOOP-9878 URL: https://issues.apache.org/jira/browse/HADOOP-9878 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: kaveh minooie Priority: Trivial Fix For: 2.1.0-beta Original Estimate: 1m Remaining Estimate: 1m By simply replacing line 34 of libexec/hadoop-config.sh from: {code} export HADOOP_PREFIX=`dirname $this`/.. {code} to {code} export HADOOP_PREFIX=$( cd $config_bin/..; pwd -P ) {code} we can eliminate all the annoying 'bin/../' from the library paths and make the output of commands like ps a lot more readable, not to mention that the OS would do just a bit less work as well. I can post a patch for it as well if it is needed.
[jira] [Resolved] (HADOOP-9878) getting rid of all the 'bin/../' from all the paths
[ https://issues.apache.org/jira/browse/HADOOP-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9878. - Resolution: Duplicate getting rid of all the 'bin/../' from all the paths --- Key: HADOOP-9878 URL: https://issues.apache.org/jira/browse/HADOOP-9878 Project: Hadoop Common Issue Type: Improvement Components: conf Reporter: kaveh minooie Priority: Trivial Original Estimate: 1m Remaining Estimate: 1m By simply replacing line 34 of libexec/hadoop-config.sh from: {code} export HADOOP_PREFIX=`dirname $this`/.. {code} to {code} export HADOOP_PREFIX=$( cd $config_bin/..; pwd -P ) {code} we can eliminate all the annoying 'bin/../' from the library paths and make the output of commands like ps a lot more readable, not to mention that the OS would do just a bit less work as well. I can post a patch for it as well if it is needed.
[jira] [Resolved] (HADOOP-9346) Upgrading to protoc 2.5.0 fails the build
[ https://issues.apache.org/jira/browse/HADOOP-9346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9346. - Resolution: Duplicate Thanks for pinging, Ravi. I'd discussed with Alejandro that this could be closed. Looks like we added a dupe link but failed to close. Closing now. Upgrading to protoc 2.5.0 fails the build - Key: HADOOP-9346 URL: https://issues.apache.org/jira/browse/HADOOP-9346 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor Labels: protobuf Attachments: HADOOP-9346.patch Reported over the Impala lists; one of the errors received is: {code} src/hadoop-common-project/hadoop-common/target/generated-sources/java/org/apache/hadoop/ha/proto/ZKFCProtocolProtos.java:[104,37] cannot find symbol. symbol: class Parser location: package com.google.protobuf {code} Worth looking into as we'll eventually someday bump our protobuf deps.
[jira] [Created] (HADOOP-9861) Invert ReflectionUtils' stack trace
Harsh J created HADOOP-9861: --- Summary: Invert ReflectionUtils' stack trace Key: HADOOP-9861 URL: https://issues.apache.org/jira/browse/HADOOP-9861 Project: Hadoop Common Issue Type: Improvement Components: util Affects Versions: 2.0.5-alpha Reporter: Harsh J Often an MR task (as an example) may fail at the configure stage due to a misconfiguration or the like, and the only thing a user gets, by virtue of MR pulling limited bytes of the diagnostic error data, is the top part of the stacktrace: {code} java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) {code} This is absolutely useless to a user, who then goes ahead and blames the framework for having an issue, rather than thinking (non-intuitively) to go see the whole task log for the full trace, especially the last part. Hundreds of times it's been a mere class that's missing, etc., but there's just too much pain involved here to troubleshoot. It would be much, much better if we inverted the trace. For example, here's what Hive could return back if we did so, for a random trouble I pulled from the web: {code} java.lang.RuntimeException: Error in configuring object Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:64) at java.lang.String.valueOf(String.java:2826) at java.lang.StringBuilder.append(StringBuilder.java:115) at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:563) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:100) ... 22 more {code} This way the user can at least be sure what part's really failing, and not get lost trying to work their way through reflection utils and upwards/downwards.
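A loose sketch of the inversion idea; this is a hypothetical helper, not committed code:

{code}
// Skip intermediate reflective wrappers so the root cause appears directly
// under the top-level message, as in the Hive example above.
static RuntimeException invert(Throwable wrapper) {
  Throwable root = wrapper;
  while (root.getCause() != null) {
    root = root.getCause();
  }
  return new RuntimeException("Error in configuring object", root);
}
{code}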
[jira] [Created] (HADOOP-9567) Provide auto-renewal for keytab based logins
Harsh J created HADOOP-9567: --- Summary: Provide auto-renewal for keytab based logins Key: HADOOP-9567 URL: https://issues.apache.org/jira/browse/HADOOP-9567 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor We do a renewal for cached tickets (obtained via kinit before using a Hadoop application), but we explicitly seem to avoid doing a renewal for keytab-based logins (done from within the client code), when we could do that as well via a similar thread.
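A sketch of the proposed renewal thread, assuming a keytab login was already performed via UserGroupInformation.loginUserFromKeytab(...); the interval is an assumption, and a real patch would derive it from the ticket lifetime:

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.security.UserGroupInformation;

public class KeytabRenewer {
  public static void start() {
    ScheduledExecutorService renewer =
        Executors.newSingleThreadScheduledExecutor();
    renewer.scheduleWithFixedDelay(() -> {
      try {
        // No-op unless the TGT is near expiry, so calling often is cheap.
        UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab();
      } catch (Exception e) {
        System.err.println("Keytab re-login attempt failed: " + e);
      }
    }, 1, 1, TimeUnit.HOURS);
  }
}
{code}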
[jira] [Resolved] (HADOOP-9510) DU command should provide a -h flag to display a more human readable format.
[ https://issues.apache.org/jira/browse/HADOOP-9510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9510. - Resolution: Not A Problem This is already available in the revamped shell apps under 2.x releases today; see http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#du DU command should provide a -h flag to display a more human readable format. Key: HADOOP-9510 URL: https://issues.apache.org/jira/browse/HADOOP-9510 Project: Hadoop Common Issue Type: Improvement Reporter: Corey J. Nolet Priority: Minor Would be useful to have the sizes print out as 500M or 3.4G instead of bytes only.
[jira] [Resolved] (HADOOP-9496) Bad merge of HADOOP-9450 on branch-2 breaks all bin/hadoop calls that need HADOOP_CLASSPATH
[ https://issues.apache.org/jira/browse/HADOOP-9496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9496. - Resolution: Fixed Fix Version/s: 2.0.5-beta Committed revision 1471230 to fix this properly. Bad merge of HADOOP-9450 on branch-2 breaks all bin/hadoop calls that need HADOOP_CLASSPATH Key: HADOOP-9496 URL: https://issues.apache.org/jira/browse/HADOOP-9496 Project: Hadoop Common Issue Type: Bug Components: bin Affects Versions: 2.0.5-beta Reporter: Gopal V Assignee: Harsh J Priority: Critical Fix For: 2.0.5-beta Attachments: HADOOP-9496.patch Merge of HADOOP-9450 to branch-2 is broken for hadoop-config.sh on trunk http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh?r1=1453486&r2=1469214&pathrev=1469214 vs on branch-2 http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-common-project/hadoop-common/src/main/bin/hadoop-config.sh?r1=1390222&r2=1469215 This is breaking all hadoop client code which needs HADOOP_CLASSPATH to be set correctly.
[jira] [Created] (HADOOP-9461) JobTracker and NameNode both grant delegation tokens to non-secure clients
Harsh J created HADOOP-9461: --- Summary: JobTracker and NameNode both grant delegation tokens to non-secure clients Key: HADOOP-9461 URL: https://issues.apache.org/jira/browse/HADOOP-9461 Project: Hadoop Common Issue Type: Bug Components: security Reporter: Harsh J Assignee: Harsh J Priority: Minor If one looks at the MAPREDUCE-1516-added logic in JobTracker.java's isAllowedDelegationTokenOp() method, and applies the non-secure states of UGI.isSecurityEnabled == false and authMethod == SIMPLE, the return result is true when the intention is false (due to the short-circuited conditionals). This allows non-secure JobClients to easily request and use DelegationTokens, and causes unwanted errors to be printed in the JobTracker when the renewer attempts to run. Ideally such clients ought to get an error if they request a DT in non-secure mode. HDFS in trunk and branch-1 both have the same problem. Trunk MR (HistoryServer) and YARN are, however, unaffected due to a simpler, inlined logic instead of reuse of this faulty method. Note that fixing this will break Oozie today, due to the merged logic of OOZIE-734. Oozie will require a fix as well if this is to be fixed in branch-1. As a result, I'm going to mark this as an Incompatible Change.
[jira] [Resolved] (HADOOP-2781) Hadoop/Groovy integration
[ https://issues.apache.org/jira/browse/HADOOP-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-2781. - Resolution: Won't Fix Closing per the comment below, as this has been inactive for a couple of years now: bq. Grool was a dead end. Possible alternatives (given FlumeJava's mention): Apache Crunch - http://crunch.apache.org and/or Cascading - http://cascading.org. Hadoop/Groovy integration - Key: HADOOP-2781 URL: https://issues.apache.org/jira/browse/HADOOP-2781 Project: Hadoop Common Issue Type: New Feature Environment: Any Reporter: Ted Dunning Attachments: trunk.tgz This is a place-holder issue to hold the initial release of the Groovy integration for Hadoop. The goal is to be able to write very simple map-reduce programs in just a few lines of code in a functional style. Word count should be less than 5 lines of code!
[jira] [Created] (HADOOP-9424) The hadoop jar invocation should include the passed jar on the classpath as a whole
Harsh J created HADOOP-9424: --- Summary: The hadoop jar invocation should include the passed jar on the classpath as a whole Key: HADOOP-9424 URL: https://issues.apache.org/jira/browse/HADOOP-9424 Project: Hadoop Common Issue Type: Bug Components: util Affects Versions: 2.0.3-alpha Reporter: Harsh J Assignee: Harsh J Priority: Minor When you have a case such as this: {{X.jar - Classes = Main, Foo}} {{Y.jar - Classes = Bar}} With implementation details such as: * Main references Bar and invokes a public, static method on it. * Bar does a class lookup to find Foo (Class.forName(Foo)). Then when you do a {{HADOOP_CLASSPATH=Y.jar hadoop jar X.jar Main}}, Bar's method fails with a ClassNotFound exception because of the way RunJar runs. RunJar extracts the passed jar and includes its contents on the ClassLoader of its current thread, but the {{Class.forName(…)}} call from another class does not check that class loader and hence cannot find the class, as it's not on any classpath it is aware of. The hadoop jar script should ideally add the passed jar argument to the CLASSPATH before RunJar is invoked, for this above case to pass.
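For completeness, the failing lookup can also be made to work from the library side by consulting the thread's context class loader, which RunJar does populate; a sketch using the Foo/Bar names from the description above:

{code}
// Inside Bar (shipped in Y.jar): resolve Foo against the context class
// loader that RunJar populates with X.jar's extracted contents.
static Class<?> findFoo() throws ClassNotFoundException {
  return Class.forName("Foo", true,
      Thread.currentThread().getContextClassLoader());
}
{code}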
[jira] [Resolved] (HADOOP-6942) Ability for having user's classes take precedence over the system classes for tasks' classpath
[ https://issues.apache.org/jira/browse/HADOOP-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-6942. - Resolution: Duplicate Fixed via MAPREDUCE-1938. Closing as dupe. Ability for having user's classes take precedence over the system classes for tasks' classpath -- Key: HADOOP-6942 URL: https://issues.apache.org/jira/browse/HADOOP-6942 Project: Hadoop Common Issue Type: Improvement Components: scripts Affects Versions: 0.22.0 Reporter: Krishna Ramachandran Attachments: HADOOP-6942.y20.patch, hadoop-common-6942.patch Fix the bin/hadoop script to facilitate MAPREDUCE-1938.
[jira] [Created] (HADOOP-9346) Upgrading to protoc 2.5.0 fails the build
Harsh J created HADOOP-9346: --- Summary: Upgrading to protoc 2.5.0 fails the build Key: HADOOP-9346 URL: https://issues.apache.org/jira/browse/HADOOP-9346 Project: Hadoop Common Issue Type: Task Reporter: Harsh J Priority: Minor Reported over the Impala lists; one of the errors received is: {code} src/hadoop-common-project/hadoop-common/target/generated-sources/java/org/apache/hadoop/ha/proto/ZKFCProtocolProtos.java:[104,37] cannot find symbol. symbol: class Parser location: package com.google.protobuf {code} Worth looking into as we'll eventually someday bump our protobuf deps.
[jira] [Created] (HADOOP-9322) LdapGroupsMapping doesn't seem to set a timeout for its directory search
Harsh J created HADOOP-9322: --- Summary: LdapGroupsMapping doesn't seem to set a timeout for its directory search Key: HADOOP-9322 URL: https://issues.apache.org/jira/browse/HADOOP-9322 Project: Hadoop Common Issue Type: Improvement Components: security Affects Versions: 2.0.3-alpha Reporter: Harsh J Priority: Minor We don't appear to be setting a timeout via http://docs.oracle.com/javase/6/docs/api/javax/naming/directory/SearchControls.html#setTimeLimit(int) before we search with http://docs.oracle.com/javase/6/docs/api/javax/naming/directory/DirContext.html#search(javax.naming.Name,%20java.lang.String,%20javax.naming.directory.SearchControls). This may occasionally lead to some unwanted NN pauses due to lock-holding on the operations that do group lookups. It is better to define a timeout than to rely on 0 (infinite wait).
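A sketch of the proposed change; the 10-second figure is an assumed default that a real patch would make configurable:

{code}
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class BoundedLdapSearch {
  // Bound the directory search instead of the default 0 (infinite wait).
  static NamingEnumeration<SearchResult> searchWithTimeout(DirContext ctx,
      String baseDn, String filter) throws NamingException {
    SearchControls controls = new SearchControls();
    controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
    controls.setTimeLimit(10_000); // milliseconds
    return ctx.search(baseDn, filter, controls);
  }
}
{code}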
[jira] [Reopened] (HADOOP-9241) DU refresh interval is not configurable
[ https://issues.apache.org/jira/browse/HADOOP-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-9241: - Thanks Nicholas; I have reverted HADOOP-9241 from trunk and branch-2. I will attach a proper patch now. DU refresh interval is not configurable --- Key: HADOOP-9241 URL: https://issues.apache.org/jira/browse/HADOOP-9241 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Harsh J Priority: Trivial Fix For: 2.0.3-alpha Attachments: HADOOP-9241.patch While the {{DF}} class's refresh interval is configurable, the {{DU}}'s isn't. We should ensure both be configurable.
[jira] [Created] (HADOOP-9257) HADOOP-9241 changed DN's default DU interval to 1m instead of 10m accidentally
Harsh J created HADOOP-9257: --- Summary: HADOOP-9241 changed DN's default DU interval to 1m instead of 10m accidentally Key: HADOOP-9257 URL: https://issues.apache.org/jira/browse/HADOOP-9257 Project: Hadoop Common Issue Type: Bug Components: util Affects Versions: 2.0.3-alpha Reporter: Harsh J Assignee: Harsh J Suresh caught this on HADOOP-9241: {quote} Even for trivial jiras, I suggest getting the code review done before committing the code. Such changes are easy and quick to review. In this patch, did DU interval become 1 minute instead of 10 minutes? {code}
-    this(path, 600000L);
-    //10 minutes default refresh interval
+    this(path, conf.getLong(CommonConfigurationKeys.FS_DU_INTERVAL_KEY,
+        CommonConfigurationKeys.FS_DU_INTERVAL_DEFAULT));

+  /** See <a href="{@docRoot}/../core-default.html">core-default.xml</a> */
+  public static final String FS_DU_INTERVAL_KEY = "fs.du.interval";
+  /** Default value for FS_DU_INTERVAL_KEY */
+  public static final long FS_DU_INTERVAL_DEFAULT = 60000;
{code} {quote}
[jira] [Created] (HADOOP-9241) DU refresh interval is not configurable
Harsh J created HADOOP-9241: --- Summary: DU refresh interval is not configurable Key: HADOOP-9241 URL: https://issues.apache.org/jira/browse/HADOOP-9241 Project: Hadoop Common Issue Type: Improvement Affects Versions: 2.0.2-alpha Reporter: Harsh J Priority: Trivial While the {{DF}} class's refresh interval is configurable, the {{DU}}'s isn't. We should ensure both be configurable.
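Once configurable, the knob is set like any other core-site property; fs.du.interval with a 10-minute (600000 ms) default is what the eventual change used, though the description text here is a sketch:

{code}
<property>
  <name>fs.du.interval</name>
  <value>600000</value>
  <description>Disk-usage (du) refresh interval in milliseconds.</description>
</property>
{code}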
[jira] [Created] (HADOOP-9243) Some improvements to the mailing lists webpage for lowering unrelated content rate
Harsh J created HADOOP-9243: --- Summary: Some improvements to the mailing lists webpage for lowering unrelated content rate Key: HADOOP-9243 URL: https://issues.apache.org/jira/browse/HADOOP-9243 Project: Hadoop Common Issue Type: Improvement Components: documentation Reporter: Harsh J Priority: Minor From Steve on HADOOP-9329: {quote} * could you add a bit of text to say user@ is not the place to discuss installation problems related to any third party products that install some variant of Hadoop on people's desktops and servers. You're the one who ends up having to bounce off all the CDH-related queries - it would help you too. * For the new Invalid JIRA link to paste into JIRA issues about this, I point to the distributions and Commercial support page on the wiki - something similar on the mailing lists page would avoid having to put any specific vendor links into the mailing lists page, and support a higher/more open update process. See http://wiki.apache.org/hadoop/InvalidJiraIssues {quote}
[jira] [Created] (HADOOP-9239) Move the general@ description to the end of lists in the mailing lists web page
Harsh J created HADOOP-9239: --- Summary: Move the general@ description to the end of lists in the mailing lists web page Key: HADOOP-9239 URL: https://issues.apache.org/jira/browse/HADOOP-9239 Project: Hadoop Common Issue Type: Improvement Components: documentation Reporter: Harsh J Priority: Minor We have users unnecessarily subscribing to and abusing the general@ list, mainly because of its presence as the first option on the page http://hadoop.apache.org/mailing_lists.html, and secondarily because of its name. This is to at least address the first cause, which is bringing growing pain to its subscribers. Let's move it to the bottom of the presented list of lists.
[jira] [Resolved] (HADOOP-8274) In pseudo or cluster mode under Cygwin, the tasktracker cannot create a new job because of a symlink problem.
[ https://issues.apache.org/jira/browse/HADOOP-8274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8274. - Resolution: Won't Fix For Windows, since the mainstream branch does not support it actively, I am closing this as a Won't Fix. I'm certain the same issue does not happen on the branch-1-win 1.x branch (or the branch-trunk-win branch), and I urge you to use that instead if you wish to continue using Windows for development or other usage. Find the Windows-optimized sources at http://svn.apache.org/repos/asf/hadoop/common/branches/branch-1-win/ or http://svn.apache.org/repos/asf/hadoop/common/branches/branch-trunk-win/. In pseudo or cluster mode under Cygwin, the tasktracker cannot create a new job because of a symlink problem. - Key: HADOOP-8274 URL: https://issues.apache.org/jira/browse/HADOOP-8274 Project: Hadoop Common Issue Type: Bug Affects Versions: 0.20.205.0, 1.0.0, 1.0.1, 0.22.0 Environment: windows7+cygwin 1.7.11-1+jdk1.6.0_31+hadoop 1.0.0 Reporter: tim.wu The standalone mode is OK. But in pseudo or cluster mode, it always throws errors, even if I just run the wordcount example. HDFS works fine, but the tasktracker cannot create threads (JVMs) for new jobs. It is empty under /logs/userlogs/job-/attempt-/. The reason appears to be that on Windows, Java cannot recognize a symlink to a folder as a folder. The detailed description is as follows, == First, the error log of the tasktracker is like: == 12/03/28 14:35:13 INFO mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201203280212_0005_m_-1386636958 12/03/28 14:35:13 INFO mapred.JvmManager: JVM Runner jvm_201203280212_0005_m_-1386636958 spawned. 12/03/28 14:35:17 INFO mapred.JvmManager: JVM Not killed jvm_201203280212_0005_m_-1386636958 but just removed 12/03/28 14:35:17 INFO mapred.JvmManager: JVM : jvm_201203280212_0005_m_-1386636958 exited with exit code -1. Number of tasks it ran: 0 12/03/28 14:35:17 WARN mapred.TaskRunner: attempt_201203280212_0005_m_02_0 : Child Error java.io.IOException: Task process exit with nonzero status of -1. 
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258) 12/03/28 14:35:21 INFO mapred.TaskTracker: addFreeSlot : current free slots : 2 12/03/28 14:35:24 INFO mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201203280212_0005_m_02_1 task's state:UNASSIGNED 12/03/28 14:35:24 INFO mapred.TaskTracker: Trying to launch : attempt_201203280212_0005_m_02_1 which needs 1 slots 12/03/28 14:35:24 INFO mapred.TaskTracker: In TaskLauncher, current free slots : 2 and trying to launch attempt_201203280212_0005_m_02_1 which needs 1 slots 12/03/28 14:35:24 WARN mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201203280212_0005_m_02_0 java.io.FileNotFoundException: D:\cygwin\home\timwu\hadoop-1.0.0\logs\userlogs\job_201203280212_0005\attempt_201203280212_0005_m_02_0\log.index (The system cannot find the path specified) at java.io.FileInputStream.open(Native Method) at java.io.FileInputStream.<init>(FileInputStream.java:120) at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102) at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:188) at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:423) at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81) at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296) at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221) at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at
[jira] [Resolved] (HADOOP-7386) Support concatenated bzip2 files
[ https://issues.apache.org/jira/browse/HADOOP-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7386. - Resolution: Duplicate Thanks for confirming! Resolving as dupe. Support concatenated bzip2 files Key: HADOOP-7386 URL: https://issues.apache.org/jira/browse/HADOOP-7386 Project: Hadoop Common Issue Type: Improvement Reporter: Allen Wittenauer Assignee: Karthik Kambatla HADOOP-6835 added the framework and direct support for concatenated gzip files. We should do the same for bzip files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-8301) Common (hadoop-tools) side of MAPREDUCE-4172
[ https://issues.apache.org/jira/browse/HADOOP-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8301. - Resolution: Won't Fix Patches were too broad and have gone stale. Will address these forms of issues over separate, smaller JIRAs in the future. Closing out parent JIRA MAPREDUCE-4172, and hence closing this out as well. Common (hadoop-tools) side of MAPREDUCE-4172 Key: HADOOP-8301 URL: https://issues.apache.org/jira/browse/HADOOP-8301 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Patches on MAPREDUCE-4172 (for MR-relevant projects) that require running off of the Hadoop Common project for Hadoop QA. One sub-task per hadoop-tools submodule will be added here for reviews. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-9091) Allow daemon startup when at least 1 (or configurable) disk is in an OK state.
[ https://issues.apache.org/jira/browse/HADOOP-9091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9091. - Resolution: Fixed This feature is already available in all our current releases via the DN volume failure toleration properties. Please see https://issues.apache.org/jira/browse/HDFS-1592. Resolving as not a problem. Please update to a release that includes it to have this addressed in your environment. Allow daemon startup when at least 1 (or configurable) disk is in an OK state. -- Key: HADOOP-9091 URL: https://issues.apache.org/jira/browse/HADOOP-9091 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.20.2 Reporter: Jelle Smet Labels: features, hadoop The given example is for datanode disk definitions, but it should be applicable to all configurations where a list of disks is provided. I have multiple local disks defined for a datanode:
<property>
  <name>dfs.data.dir</name>
  <value>/data/01/dfs/dn,/data/02/dfs/dn,/data/03/dfs/dn,/data/04/dfs/dn,/data/05/dfs/dn,/data/06/dfs/dn</value>
  <final>true</final>
</property>
When one of those disks breaks and is unmounted, the mountpoint (such as /data/03 in this example) becomes a regular directory which doesn't have the permissions and directory structure Hadoop is expecting. When this situation happens, the datanode fails to restart because of this, while we actually have enough disks in an OK state to proceed. The only way around this is to alter the configuration and omit that specific disk configuration. In my opinion, it would be more practical to let Hadoop daemons start when at least 1 disk/partition in the provided list is in a usable state. This prevents having to roll out custom configurations for systems which temporarily have a disk (and therefore directory layout) missing. This might also be made configurable, so that at least X partitions out of the available ones must be in an OK state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
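The toleration knob referenced above is the dfs.datanode.failed.volumes.tolerated property from the HDFS-1592 line of work. A minimal sketch of setting it programmatically (the value 2 is only an example; in practice it is set in hdfs-site.xml):
{code}
// Minimal sketch: tolerate up to 2 failed dfs.data.dir volumes at startup.
// The property name is the one introduced by the HDFS-1592 work; the value
// here is an example only.
import org.apache.hadoop.conf.Configuration;

public class VolumeTolerationExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // The DataNode starts as long as no more than 2 of the configured
    // data directories are in a failed state.
    conf.setInt("dfs.datanode.failed.volumes.tolerated", 2);
    System.out.println(conf.get("dfs.datanode.failed.volumes.tolerated"));
  }
}
{code}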
[jira] [Resolved] (HADOOP-9066) Sorting for FileStatus[]
[ https://issues.apache.org/jira/browse/HADOOP-9066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-9066. - Resolution: Invalid Since HADOOP-8934 is already adding FileStatus data-based sorting in a place that matters, and this JIRA seems to just add a simple example of utilizing FileStatus comparators, am resolving this as Invalid at the moment, as the example isn't of much value (given that the Javadoc for FileStatus is already clear, and there's no use-case for this in MR, etc.) so far. Sorting for FileStatus[] Key: HADOOP-9066 URL: https://issues.apache.org/jira/browse/HADOOP-9066 Project: Hadoop Common Issue Type: Improvement Environment: java7, RedHat9, Hadoop 0.20.2, eclipse-jee-juno-linux-gtk.tar.gz Reporter: david king Labels: patch Attachments: ConcreteFileStatusAscComparable.java, ConcreteFileStatusDescComparable.java, FileStatusComparable.java, FileStatusTool.java, TestFileStatusTool.java I will submit a patch with a FileStatusTool that can be used to sort FileStatus[] via a Comparator; the Comparator can not only be custom-implemented, but the example code can also be used as-is. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
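For reference, sorting a FileStatus[] needs no special tooling; a plain java.util.Comparator does the job. A minimal sketch (not taken from the attached files) that sorts a listing by modification time:
{code}
// Minimal sketch: sort a directory listing by modification time.
import java.util.Arrays;
import java.util.Comparator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SortFileStatusExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus[] statuses = fs.listStatus(new Path(args[0]));
    Arrays.sort(statuses, new Comparator<FileStatus>() {
      @Override
      public int compare(FileStatus a, FileStatus b) {
        return Long.compare(a.getModificationTime(), b.getModificationTime());
      }
    });
    for (FileStatus s : statuses) {
      System.out.println(s.getModificationTime() + "\t" + s.getPath());
    }
  }
}
{code}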
[jira] [Created] (HADOOP-9068) Reuse (and not duplicate) globbing logic between FileSystem and FileContext
Harsh J created HADOOP-9068: --- Summary: Reuse (and not duplicate) globbing logic between FileSystem and FileContext Key: HADOOP-9068 URL: https://issues.apache.org/jira/browse/HADOOP-9068 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha Reporter: Harsh J FileSystem's globbing code is currently duplicated in FileContext.Util class. We should reuse the implementation rather than maintain two pieces of it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-9023) HttpFs is too restrictive on usernames
Harsh J created HADOOP-9023: --- Summary: HttpFs is too restrictive on usernames Key: HADOOP-9023 URL: https://issues.apache.org/jira/browse/HADOOP-9023 Project: Hadoop Common Issue Type: Bug Reporter: Harsh J HttpFs tries to use UserProfile.USER_PATTERN to match all usernames before a doAs impersonation function. This regex is too strict for most usernames, as it disallows any special character at all. We should relax it more or ditch needing to match things there. WebHDFS currently has no such limitations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
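To illustrate the class of problem (the pattern below is a made-up stand-in, not the actual UserProfile.USER_PATTERN): a strict alphanumeric regex rejects usernames that are perfectly legal on most systems.
{code}
// Illustration only: a hypothetical strict pattern, NOT the HttpFs regex.
import java.util.regex.Pattern;

public class UserPatternDemo {
  public static void main(String[] args) {
    Pattern strict = Pattern.compile("^[a-z_][a-z0-9_]*$"); // assumed pattern
    System.out.println(strict.matcher("alice").matches());        // true
    System.out.println(strict.matcher("svc.etl-user").matches()); // false: '.' and '-' rejected
  }
}
{code}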
[jira] [Resolved] (HADOOP-8927) org.apache.hadoop.hive.jdbc.HiveDriver loads outside of Map Reduce but fails on Map reduce
[ https://issues.apache.org/jira/browse/HADOOP-8927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8927. - Resolution: Not A Problem org.apache.hadoop.hive.jdbc.HiveDriver loads outside of Map Reduce but fails on Map reduce -- Key: HADOOP-8927 URL: https://issues.apache.org/jira/browse/HADOOP-8927 Project: Hadoop Common Issue Type: Bug Components: conf, tools Affects Versions: 2.0.2-alpha Reporter: VJ Priority: Minor -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8876) SequenceFile default compression is RECORD, not BLOCK
Harsh J created HADOOP-8876: --- Summary: SequenceFile default compression is RECORD, not BLOCK Key: HADOOP-8876 URL: https://issues.apache.org/jira/browse/HADOOP-8876 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.0.0-alpha Reporter: Harsh J Currently both the SequenceFile writer and the MR defaults for SequenceFile compression default to RECORD type compression, while most recommendations are to use BLOCK for smaller end sizes instead. Should we not change the default? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
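Regardless of what the default becomes, a writer can request BLOCK compression explicitly. A minimal sketch using the long-standing createWriter overload:
{code}
// Minimal sketch: explicitly request BLOCK compression instead of relying
// on the RECORD default when creating a SequenceFile writer.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class BlockCompressedWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path(args[0]), Text.class, IntWritable.class,
        SequenceFile.CompressionType.BLOCK); // explicit, since default is RECORD
    try {
      writer.append(new Text("key"), new IntWritable(1));
    } finally {
      writer.close();
    }
  }
}
{code}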
[jira] [Created] (HADOOP-8863) Eclipse plugin may not be working on Juno due to changes in it
Harsh J created HADOOP-8863: --- Summary: Eclipse plugin may not be working on Juno due to changes in it Key: HADOOP-8863 URL: https://issues.apache.org/jira/browse/HADOOP-8863 Project: Hadoop Common Issue Type: Bug Components: contrib/eclipse-plugin Affects Versions: 1.2.0 Reporter: Harsh J Assignee: Harsh J We need to debug/investigate why it is so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7941) NoClassDefFoundError while running distcp/archive
[ https://issues.apache.org/jira/browse/HADOOP-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7941. - Resolution: Cannot Reproduce Doesn't seem to be a problem anymore; both the hadoop and mapred scripts run these fine. Resolving as Cannot Reproduce (anymore). NoClassDefFoundError while running distcp/archive - Key: HADOOP-7941 URL: https://issues.apache.org/jira/browse/HADOOP-7941 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.23.1 Reporter: Ramya Sunil bin/hadoop distcp {noformat} Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/tools/DistCp Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.tools.DistCp at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:248) Could not find the main class: org.apache.hadoop.tools.DistCp. Program will exit. {noformat} Same is the case while running 'bin/hadoop archive' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8864) Addendum to HADOOP-8840: Add a coloring case for +0 results too.
Harsh J created HADOOP-8864: --- Summary: Addendum to HADOOP-8840: Add a coloring case for +0 results too. Key: HADOOP-8864 URL: https://issues.apache.org/jira/browse/HADOOP-8864 Project: Hadoop Common Issue Type: Bug Reporter: Harsh J Assignee: Harsh J Attachments: HADOOP-8864.patch Noticed on MAPREDUCE-3223 that we failed to cover coloring the +0 case we print sometimes for doc-only patches. These can be colored green too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-8864) Addendum to HADOOP-8840: Add a coloring case for +0 results too.
[ https://issues.apache.org/jira/browse/HADOOP-8864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8864. - Resolution: Fixed Fix Version/s: 3.0.0 Since this was a trivial addition, I went ahead and committed it to trunk. Addendum to HADOOP-8840: Add a coloring case for +0 results too. Key: HADOOP-8864 URL: https://issues.apache.org/jira/browse/HADOOP-8864 Project: Hadoop Common Issue Type: Improvement Reporter: Harsh J Assignee: Harsh J Priority: Trivial Fix For: 3.0.0 Attachments: HADOOP-8864.patch Noticed on MAPREDUCE-3223 that we failed to cover coloring the +0 case we print sometimes for doc-only patches. These can be colored green too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8838) Colorize the test-patch output sent to JIRA
Harsh J created HADOOP-8838: --- Summary: Colorize the test-patch output sent to JIRA Key: HADOOP-8838 URL: https://issues.apache.org/jira/browse/HADOOP-8838 Project: Hadoop Common Issue Type: Improvement Components: build Reporter: Harsh J Assignee: Harsh J Priority: Trivial It would be helpful to mark the -1s in red and +1s in green. Helps avoid missing stuff like the findbugs warnings, etc., that we've been bitten by. Also helps run through the results faster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8839) test-patch's -1 on @author tag presence doesn't cause a -1 to the overall result
Harsh J created HADOOP-8839: --- Summary: test-patch's -1 on @author tag presence doesn't cause a -1 to the overall result Key: HADOOP-8839 URL: https://issues.apache.org/jira/browse/HADOOP-8839 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Harsh J Priority: Trivial As observed on HADOOP-8838. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8840) Fix the test-patch colorizer to cover all sorts of +1 lines.
Harsh J created HADOOP-8840: --- Summary: Fix the test-patch colorizer to cover all sorts of +1 lines. Key: HADOOP-8840 URL: https://issues.apache.org/jira/browse/HADOOP-8840 Project: Hadoop Common Issue Type: Bug Components: build Reporter: Harsh J Assignee: Harsh J As noticed by Jason on HADOOP-8838, I missed some of the entries that needed to be colorized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8844) Add a plaintext fs -text test-case
Harsh J created HADOOP-8844: --- Summary: Add a plaintext fs -text test-case Key: HADOOP-8844 URL: https://issues.apache.org/jira/browse/HADOOP-8844 Project: Hadoop Common Issue Type: Test Components: fs Affects Versions: 2.0.0-alpha Reporter: Harsh J The TestDFSShell's textTest(…) currently tests all sorts of binary and compressed files, but doesn't test plaintext files. We should add one test for plaintext as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
Harsh J created HADOOP-8845: --- Summary: When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException Key: HADOOP-8845 URL: https://issues.apache.org/jira/browse/HADOOP-8845 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Harsh J A brief description from my colleague Stephen Fritz who helped discover it: {quote}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String" > testfile -- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir -- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 -- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile -- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile -- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x - hdfs hadoop 0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading
-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String -- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String -- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String -- success!
Note that we used an '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1', and ignored the regular file '/tmp/testdir/testfile'
-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser -- let's try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x - hdfs hadoop 0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String -- good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String -- so far so good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode=/tmp/testdir/testfile:hdfs:hadoop:-rw-r--r--
{quote} Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as testfile is a file and not a parent path to be looked up under. Surprisingly, the superuser avoids hitting the error as a result of bypassing permissions, but whether it is fine to let it be like that or not can be taken up on another JIRA. This JIRA targets a client-side fix to not cause such /path/file/dir kinds of lookups. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
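A rough sketch of the client-side idea (illustrative only, not the committed patch): when expanding the parent portion of the glob, skip non-directory matches so a plain file is never probed as a parent path.
{code}
// Illustrative sketch: expand /tmp/testdir/*/testfile by hand, filtering
// out non-directories before probing for the child, which avoids the ACE.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobDirsOnly {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus[] parents = fs.globStatus(new Path("/tmp/testdir/*"));
    if (parents == null) {
      return; // nothing matched the parent glob
    }
    for (FileStatus parent : parents) {
      if (!parent.isDirectory()) {
        continue; // a plain file cannot contain testfile; probing it ACEs
      }
      Path child = new Path(parent.getPath(), "testfile");
      if (fs.exists(child)) {
        System.out.println(child);
      }
    }
  }
}
{code}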
[jira] [Reopened] (HADOOP-7698) jsvc target fails on x86_64
[ https://issues.apache.org/jira/browse/HADOOP-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-7698: - Dang, I forgot to see this was 1.x related. Reopened to check 1.x now. jsvc target fails on x86_64 --- Key: HADOOP-7698 URL: https://issues.apache.org/jira/browse/HADOOP-7698 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.20.205.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Attachments: HADOOP-7698-1.patch, HADOOP-7968.patch Recent changes to the build.xml determine which jsvc file to download based on the os.arch. It maps various arch values to i386 or x86_64. However, it notably doesn't consider x86_64 to be x86_64. The result is that the download fails because {{os-arch}} doesn't expand. {code} build.xml:2626: Can't get http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz {code} This breaks {{test-patch}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7698) jsvc target fails on x86_64
[ https://issues.apache.org/jira/browse/HADOOP-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7698. - Resolution: Fixed Fix Version/s: 1.2.0 jsvc target fails on x86_64 --- Key: HADOOP-7698 URL: https://issues.apache.org/jira/browse/HADOOP-7698 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.20.205.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Priority: Critical Fix For: 1.2.0 Attachments: HADOOP-7698-1.patch, HADOOP-7968.patch Recent changes to the build.xml determine which jsvc file to download based on the os.arch. It maps various arch values to i386 or x86_64. However, it notably doesn't consider x86_64 to be x86_64. The result is that the download fails because {{os-arch}} doesn't expand. {code} build.xml:2626: Can't get http://archive.apache.org/dist/commons/daemon/binaries/1.0.2/linux/commons-daemon-1.0.2-bin-linux-${os-arch}.tar.gz {code} This breaks {{test-patch}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7542) Change XML format to 1.1 to add support for serializing additional characters
[ https://issues.apache.org/jira/browse/HADOOP-7542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7542. - Resolution: Won't Fix Release Note: (was: Changes the Configuration file's XML format to 1.1 from 1.0, which adds support for serializing additional separator characters.) Thanks Steve. For the original reason of the textoutputformat separator config, I've filed MAPREDUCE-4677 for getting that done. Change XML format to 1.1 to add support for serializing additional characters - Key: HADOOP-7542 URL: https://issues.apache.org/jira/browse/HADOOP-7542 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 0.20.2 Reporter: Suhas Gogate Assignee: Michael Katzenellenbogen Attachments: HADOOP-7542-v1.patch, MAPREDUCE-109.patch, MAPREDUCE-109-v2.patch, MAPREDUCE-109-v3.patch, MAPREDUCE-109-v4.patch The feature added by this JIRA has a problem when setting values containing characters that are invalid in XML, e.g. Ctrl-A: mapred.textoutputformat.separator = \u0001, i.e. String delim = "\u0001"; conf.set("mapred.textoutputformat.separator", delim); The job client serializes the jobconf with mapred.textoutputformat.separator set to \u0001 (Ctrl-A), and the problem happens when it is de-serialized (read back) by the job tracker, where it encounters an invalid XML character. The test for this feature, testFormatWithCustomSeparator(), does not serialize the jobconf after adding the separator as Ctrl-A and hence does not detect the specific problem. Here is an exception: 08/12/06 01:40:50 INFO mapred.FileInputFormat: Total input paths to process : 1 org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference #1 is an invalid XML character. at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:961) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:864) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:832) at org.apache.hadoop.conf.Configuration.get(Configuration.java:291) at org.apache.hadoop.mapred.JobConf.getJobPriority(JobConf.java:1163) at org.apache.hadoop.mapred.JobInProgress.init(JobInProgress.java:179) at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1783) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888) at org.apache.hadoop.ipc.Client.call(Client.java:715) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216) at org.apache.hadoop.mapred.$Proxy1.submitJob(Unknown Source) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788) at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026) at -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
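The round trip is easy to reproduce against a Configuration object alone; a hedged sketch (the exact failure point may vary by JDK and XML parser, but the invalid character reference is the culprit described above):
{code}
// Reproduction sketch: a Ctrl-A stored in a Configuration survives
// serialization as an XML character reference that XML 1.0 parsers reject.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.apache.hadoop.conf.Configuration;

public class CtrlACharRoundTrip {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(false);
    conf.set("mapred.textoutputformat.separator", "\u0001");

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    conf.writeXml(out); // what the job client does when writing job.xml

    Configuration reread = new Configuration(false);
    reread.addResource(new ByteArrayInputStream(out.toByteArray()));
    // Expected to fail while parsing, as in the SAXParseException above.
    System.out.println(reread.get("mapred.textoutputformat.separator"));
  }
}
{code}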
[jira] [Resolved] (HADOOP-7725) fix test-patch so that Jenkins can accept patches to the hadoop-tools module.
[ https://issues.apache.org/jira/browse/HADOOP-7725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7725. - Resolution: Duplicate Fix Version/s: (was: 0.24.0) The hadoop-tools and others are all covered via HADOOP-8308 now. Marking as dupe of that. Reopen if I am incorrect. fix test-patch so that Jenkins can accept patches to the hadoop-tools module. - Key: HADOOP-7725 URL: https://issues.apache.org/jira/browse/HADOOP-7725 Project: Hadoop Common Issue Type: Improvement Components: build Affects Versions: 0.23.0, 0.24.0 Reporter: Alejandro Abdelnur Basically, test-patch.sh needs some tinkering to recognize hadoop-tools-project alongside common/mapreduce/hdfs. It also needs changes to compile and run tests in hadoop-tools projects on patch submission. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7855) Improve DiskChecker javadocs
[ https://issues.apache.org/jira/browse/HADOOP-7855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7855. - Resolution: Duplicate Fix Version/s: (was: 0.24.0) Dupe of HADOOP-7856 (Accidental dual-submit). Improve DiskChecker javadocs Key: HADOOP-7855 URL: https://issues.apache.org/jira/browse/HADOOP-7855 Project: Hadoop Common Issue Type: Bug Components: util Reporter: Eli Collins Labels: noob The javadocs for DiskChecker#checkDir(File dir) trail off; they look like they weren't completed, and should be. While checkDir(File) uses java.io.File to check if a dir actually is writable, the version of checkDir that takes an FsPermission uses FsAction#implies, which doesn't actually check if a dir is writable (e.g. it passes on a read-only file system). So switching from one version to the other can cause unexpected bugs. Let's call this out explicitly in the javadocs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7884) test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist.
[ https://issues.apache.org/jira/browse/HADOOP-7884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7884. - Resolution: Not A Problem Fix Version/s: (was: 0.24.0) No longer a problem after HADOOP-8308. test-patch seems to fail when a patch goes across projects (common/hdfs/mapreduce) or touches hadoop-assemblies/hadoop-dist. Key: HADOOP-7884 URL: https://issues.apache.org/jira/browse/HADOOP-7884 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0 Reporter: Alejandro Abdelnur Take HDFS-2178 for example: the patch applies cleanly, but test-patch fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7840) Cleanup unnecessary exceptions thrown and unnecessary casts
[ https://issues.apache.org/jira/browse/HADOOP-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7840. - Resolution: Invalid If we remove those throws IOException bits from the methods you've changed, won't we inadvertently affect DistributedFileSystem, which does override these methods and makes use of the throws specification? We'll be breaking the API if I am not wrong. For example, I tried removing throws IOException from FileSystem, and DistributedFileSystem immediately complained at the overridden method because it broke compatibility. I'm closing this as Invalid at this point, but please reopen if I got something wrong. Cleanup unnecessary exceptions thrown and unnecessary casts --- Key: HADOOP-7840 URL: https://issues.apache.org/jira/browse/HADOOP-7840 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.24.0 Reporter: Hari Mankude Assignee: Hari Mankude Priority: Minor Attachments: hadoop-7840.trunk.patch Cleanup build warnings. It is the file in the hadoop-common subtree for HDFS-2564. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
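The compatibility point is plain Java language behavior, shown below with hypothetical classes rather than the actual FileSystem/DistributedFileSystem pair: narrowing a parent's throws clause breaks any override that still declares the checked exception.
{code}
import java.io.IOException;

class Base {
  // If this 'throws IOException' were removed...
  public void rename() throws IOException {
  }
}

class Sub extends Base {
  // ...this override would no longer compile: an overriding method cannot
  // declare broader checked exceptions than the method it overrides.
  @Override
  public void rename() throws IOException {
    throw new IOException("remote failure");
  }
}

public class ThrowsNarrowingDemo {
  public static void main(String[] args) {
    try {
      new Sub().rename();
    } catch (IOException e) {
      System.out.println("Caught: " + e.getMessage());
    }
  }
}
{code}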
[jira] [Created] (HADOOP-8833) fs -text should make sure to call inputstream.seek(0) before using input stream
Harsh J created HADOOP-8833: --- Summary: fs -text should make sure to call inputstream.seek(0) before using input stream Key: HADOOP-8833 URL: https://issues.apache.org/jira/browse/HADOOP-8833 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 2.0.2-alpha Reporter: Harsh J Assignee: Harsh J From Muddy Dixon on HADOOP-8449: Hi We found a change in the order of the switch and guard blocks in {code}private InputStream forMagic(Path p, FileSystem srcFs) throws IOException{code} Because of this change, the return value of {code}codec.createInputStream(i){code} changes if a codec exists. Old:
{code}
private InputStream forMagic(Path p, FileSystem srcFs) throws IOException {
  FSDataInputStream i = srcFs.open(p);
  // check codecs
  CompressionCodecFactory cf = new CompressionCodecFactory(getConf());
  CompressionCodec codec = cf.getCodec(p);
  if (codec != null) {
    return codec.createInputStream(i);
  }
  switch(i.readShort()) {
    // cases
  }
}
{code}
New:
{code}
private InputStream forMagic(Path p, FileSystem srcFs) throws IOException {
  FSDataInputStream i = srcFs.open(p);
  switch(i.readShort()) { // === index (or pointer) processes!!
    // cases
    default: {
      // Check the type of compression instead, depending on Codec class's
      // own detection methods, based on the provided path.
      CompressionCodecFactory cf = new CompressionCodecFactory(getConf());
      CompressionCodec codec = cf.getCodec(p);
      if (codec != null) {
        return codec.createInputStream(i);
      }
      break;
    }
  }
  // File is non-compressed, or not a file container we know.
  i.seek(0);
  return i;
}
{code}
The fix is to call i.seek(0) before we use i anywhere. I missed that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7726) eclipse target does not build with 0.24.0
[ https://issues.apache.org/jira/browse/HADOOP-7726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7726. - Resolution: Cannot Reproduce (This has nothing to do with the eclipse-plugin) No longer an issue on trunk/2.x branches. Please see http://wiki.apache.org/hadoop/HowToContribute to import sources into eclipse. eclipse target does not build with 0.24.0 - Key: HADOOP-7726 URL: https://issues.apache.org/jira/browse/HADOOP-7726 Project: Hadoop Common Issue Type: Bug Components: build Affects Versions: 0.24.0 Environment: Fedora 15 Reporter: Tim Broberg I'm new to hadoop, java, and eclipse, so please forgive me if I jumble multiple issues together or mistake the symptoms of one problem for a separate issue. Attempting to follow the build instructions from http://wiki.apache.org/hadoop/EclipseEnvironment, the following commands are to be executed:
1 - mvn test -DskipTests
2 - mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
3 - cd hdfs; ant compile eclipse
4 - cd ../; cd mapreduce; ant compile eclipse
A - If mvn test -DskipTests is used for #1, #2 fails with [ERROR] Failed to execute goal on project hadoop-yarn-common: Could not resolve dependencies for project org.apache.hadoop:hadoop-yarn-common:jar:0.24.0-SNAPSHOT: Per Luke Lu's suggestion, mvn install -DskipTests -P-cbuild instead of step #1 cleared up this issue.
B - For steps #3 and #4, there are no hdfs or mapreduce subdirectories. These appear to have been renamed hadoop-hdfs-project and hadoop-mapreduce-project.
C - For step #3, if I then go to hadoop-hdfs-project instead and perform ant compile eclipse, no build.xml file is found - Buildfile: build.xml does not exist!
D - For step #4, if I go to hadoop-mapreduce-project and do ant compile eclipse, a set of errors much like #A is produced: [ivy:resolve] :: [ivy:resolve] :: UNRESOLVED DEPENDENCIES :: [ivy:resolve] :: [ivy:resolve] :: org.apache.hadoop#hadoop-yarn-server-common;0.24.0-SNAPSHOT: not found [ivy:resolve] :: org.apache.hadoop#hadoop-mapreduce-client-core;0.24.0-SNAPSHOT: not found [ivy:resolve] :: org.apache.hadoop#hadoop-yarn-common;0.24.0-SNAPSHOT: not found [ivy:resolve] ::
E - If I ignore these issues and import the projects generated in step #2, I get a bunch of errors related to the lack of an M2_REPO definition. This variable needs to be added by the build scripts or documented in the wiki.
F - Once that is resolved, eclipse shows hundreds of errors and warnings starting with AvroRecord cannot be resolved to a type.
Thanks so much for your work on this, but it needs a little more effort in documentation and/or development before it is usable again. Thanks. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HADOOP-8362) Improve exception message when Configuration.set() is called with a null key or value
[ https://issues.apache.org/jira/browse/HADOOP-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-8362: - Hi Suresh, Looks like this wasn't committed? I'm going ahead and committing it in. Reopening for until it is done. Improve exception message when Configuration.set() is called with a null key or value - Key: HADOOP-8362 URL: https://issues.apache.org/jira/browse/HADOOP-8362 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Todd Lipcon Assignee: madhukara phatak Priority: Trivial Labels: newbie Fix For: 3.0.0 Attachments: HADOOP-8362-1.patch, HADOOP-8362-2.patch, HADOOP-8362-3.patch, HADOOP-8362-4.patch, HADOOP-8362-5.patch, HADOOP-8362-6.patch, HADOOP-8362-7.patch, HADOOP-8362-8.patch, HADOOP-8362.9.patch, HADOOP-8362.patch Currently, calling Configuration.set(...) with a null value results in a NullPointerException within Properties.setProperty. We should check for null key/value and throw a better exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-8362) Improve exception message when Configuration.set() is called with a null key or value
[ https://issues.apache.org/jira/browse/HADOOP-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8362. - Resolution: Fixed Fix Version/s: (was: 3.0.0) 2.0.1-alpha Target Version/s: (was: 2.0.0-alpha) Committed to branch-2 and trunk. Thanks Madhukara and Suresh! Improve exception message when Configuration.set() is called with a null key or value - Key: HADOOP-8362 URL: https://issues.apache.org/jira/browse/HADOOP-8362 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Todd Lipcon Assignee: madhukara phatak Priority: Trivial Labels: newbie Fix For: 2.0.1-alpha Attachments: HADOOP-8362-1.patch, HADOOP-8362-2.patch, HADOOP-8362-3.patch, HADOOP-8362-4.patch, HADOOP-8362-5.patch, HADOOP-8362-6.patch, HADOOP-8362-7.patch, HADOOP-8362-8.patch, HADOOP-8362.10.patch, HADOOP-8362.9.patch, HADOOP-8362.patch Currently, calling Configuration.set(...) with a null value results in a NullPointerException within Properties.setProperty. We should check for null key/value and throw a better exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
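The committed patch lives in the attachments above; as a flavor of the idea, here is a minimal external sketch (names and messages are illustrative, not the committed code) that validates before Properties.setProperty can NPE:
{code}
// Illustrative guard around Configuration.set(): reject nulls with a
// descriptive message instead of a bare NullPointerException.
import org.apache.hadoop.conf.Configuration;

public class SafeConfSet {
  static void set(Configuration conf, String name, String value) {
    if (name == null) {
      throw new IllegalArgumentException("Property name must not be null");
    }
    if (value == null) {
      throw new IllegalArgumentException(
          "The value of property " + name + " must not be null");
    }
    conf.set(name, value);
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    set(conf, "fs.defaultFS", "hdfs://example-nn:8020"); // hypothetical host; fine
    set(conf, "fs.defaultFS", null); // throws a clear IllegalArgumentException
  }
}
{code}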
[jira] [Created] (HADOOP-8597) FsShell's Text command should be able to read avro data files
Harsh J created HADOOP-8597: --- Summary: FsShell's Text command should be able to read avro data files Key: HADOOP-8597 URL: https://issues.apache.org/jira/browse/HADOOP-8597 Project: Hadoop Common Issue Type: New Feature Components: fs Affects Versions: 2.0.0-alpha Reporter: Harsh J Similar to SequenceFiles are Apache Avro's DataFiles. Since these are getting popular as a data format, perhaps it would be useful if {{fs -text}} were to add some support for reading them, like it reads SequenceFiles. Should be easy since Avro is already a dependency and provides the required classes. Up for discussion is the output we ought to emit. Avro DataFiles aren't as simple as text, nor do they have the singular key-value pair structure of SequenceFiles. They usually contain a set of fields defined as a record, and the usual text output, as available from avro-tools via http://avro.apache.org/docs/current/api/java/org/apache/avro/tool/DataFileReadTool.html, is in proper JSON format. I think we should use the JSON format as the output, rather than a delimited form, for there are many complex structures in Avro and JSON is the easiest and least-work way to display them (Avro supports JSON dumping by itself). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
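A minimal sketch of the proposed output mode, leaning on Avro's own JSON rendering (a generic record's toString() emits JSON); this is an illustration of the idea, not the eventual FsShell implementation:
{code}
// Sketch: print each record of an Avro data file as one JSON line.
import java.io.File;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class AvroToJson {
  public static void main(String[] args) throws Exception {
    DataFileReader<GenericRecord> reader = new DataFileReader<GenericRecord>(
        new File(args[0]), new GenericDatumReader<GenericRecord>());
    try {
      for (GenericRecord record : reader) {
        System.out.println(record); // toString() of a generic record is JSON
      }
    } finally {
      reader.close();
    }
  }
}
{code}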
[jira] [Created] (HADOOP-8588) SerializationFactory shouldn't throw a NullPointerException if the serializations list is empty
Harsh J created HADOOP-8588: --- Summary: SerializationFactory shouldn't throw a NullPointerException if the serializations list is empty Key: HADOOP-8588 URL: https://issues.apache.org/jira/browse/HADOOP-8588 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor The SerializationFactory throws an NPE if CommonConfigurationKeys.IO_SERIALIZATIONS_KEY is set to an empty list in the config. It should rather print a WARN log indicating the serializations list is empty, and start up without any valid serialization classes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-5555) JobClient should provide an API to return the job names of jobs
[ https://issues.apache.org/jira/browse/HADOOP-5555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-5555. - Resolution: Not A Problem The JobClient provides both Job and RunningJob returns via some of its cluster-connecting methods, which in turn provide an API to retrieve the Job Name string already. Hence, this has already been fixed. For the 'hadoop job -list' enhancement to show the same, see MAPREDUCE-4424 instead (which I just forked out). Resolving as Not a Problem (anymore). JobClient should provide an API to return the job names of jobs --- Key: HADOOP-5555 URL: https://issues.apache.org/jira/browse/HADOOP-5555 Project: Hadoop Common Issue Type: Improvement Reporter: Runping Qi Currently, there seems to be no way to get the job name of a job from its job id. The JobClient should provide a way to do so. The command line hadoop job -list should also return the job names. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
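For anyone looking for the API path referred to above, a minimal sketch (error handling elided):
{code}
// Sketch: resolve a job's name from its id via the JobClient API.
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

public class PrintJobName {
  public static void main(String[] args) throws Exception {
    JobClient client = new JobClient(new JobConf());
    RunningJob job = client.getJob(JobID.forName(args[0]));
    if (job != null) {
      System.out.println(job.getJobName());
    }
  }
}
{code}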
[jira] [Resolved] (HADOOP-6817) SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library
[ https://issues.apache.org/jira/browse/HADOOP-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-6817. - Resolution: Duplicate This is being addressed via HADOOP-8582. SequenceFile.Reader can't read gzip format compressed sequence file which produce by a mapreduce job without native compression library --- Key: HADOOP-6817 URL: https://issues.apache.org/jira/browse/HADOOP-6817 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 0.20.2 Environment: Cluster: CentOS 5, jdk1.6.0_20; Client: Mac SnowLeopard, jdk1.6.0_20 Reporter: Wenjun Huang A Hadoop job outputs a gzip-compressed sequence file (whether record-compressed or block-compressed). The client program uses SequenceFile.Reader to read this sequence file; when reading, the client program shows the following exceptions: 2090 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2091 [main] INFO org.apache.hadoop.io.compress.CodecPool - Got brand-new decompressor Exception in thread "main" java.io.EOFException at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:207) at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:197) at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:136) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:58) at java.util.zip.GZIPInputStream.init(GZIPInputStream.java:68) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.init(GzipCodec.java:92) at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.init(GzipCodec.java:101) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:170) at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:180) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1520) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1428) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1417) at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1412) at com.shiningware.intelligenceonline.taobao.mapreduce.HtmlContentSeqOutputView.main(HtmlContentSeqOutputView.java:28) I studied the code in the org.apache.hadoop.io.SequenceFile.Reader.init method and read:
// Initialize... *not* if this we are constructing a temporary Reader
if (!tempReader) {
  valBuffer = new DataInputBuffer();
  if (decompress) {
    valDecompressor = CodecPool.getDecompressor(codec);
    valInFilter = codec.createInputStream(valBuffer, valDecompressor);
    valIn = new DataInputStream(valInFilter);
  } else {
    valIn = valBuffer;
  }
The problem seems to be caused by valBuffer = new DataInputBuffer();, because GzipCodec.createInputStream creates an instance of GzipInputStream whose constructor creates an instance of the ResetableGZIPInputStream class. When ResetableGZIPInputStream's constructor calls its base class java.util.zip.GZIPInputStream's constructor, it tries to read the empty valBuffer and gets no content, so it throws an EOFException. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-8577) The RPC must have failed proxyUser (auth:SIMPLE) via realus...@hadoop.apache.org (auth:SIMPLE)
[ https://issues.apache.org/jira/browse/HADOOP-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8577. - Resolution: Invalid The JIRA is to track issues with the project, not for user/dev-help. Please ask your question on common-dev[at]hadoop.apache.org mailing lists instead, and refrain from posting general questions on the JIRA. Thanks! :) P.s. The issue is your OS. Fix your /etc/hosts to use the right format of IP FQDN ALIAS, instead of IP ALIAS FQDN. In any case, please mail the right user/dev group. See http://hadoop.apache.org/mailing_lists.html The RPC must have failed proxyUser (auth:SIMPLE) via realus...@hadoop.apache.org (auth:SIMPLE) -- Key: HADOOP-8577 URL: https://issues.apache.org/jira/browse/HADOOP-8577 Project: Hadoop Common Issue Type: Bug Components: test Environment: Ubuntu 11 JDK 1.7 Maven 3.0.4 Reporter: chandrashekhar Kotekar Priority: Minor Original Estimate: 12h Remaining Estimate: 12h Hi, I have downloaded maven source code today itself and tried test it. I did following steps : 1) mvn clean 2) mvn compile 3) mvn test After 3rd step one step failed. Stack trace of failed test is as follows : Failed tests: testRealUserIPNotSpecified(org.apache.hadoop.security.TestDoAsEffectiveUser): The RPC must have failed proxyUser (auth:SIMPLE) via realus...@hadoop.apache.org (auth:SIMPLE) testWithDirStringAndConf(org.apache.hadoop.fs.shell.TestPathData): checking exist testPartialAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:123 but was:myfs://host.a:123 testFullAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:null but was:java.lang.IllegalArgumentException: Wrong FS: myfs://host/file, expected: myfs://host.a.b testShortAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:123 but was:myfs://host:123 testPartialAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:123 but was:myfs://host.a:123 testShortAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:123 but was:myfs://host:123 testIpAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://127.0.0.1:456 but was:myfs://localhost:456 testAuthorityFromDefaultFS(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:123 but was:myfs://host:123 testFullAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:null but was:java.lang.IllegalArgumentException: Wrong FS: myfs://host/file, expected: myfs://host.a.b:123 testShortAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:456 but was:myfs://host:456 testPartialAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://host.a.b:456 but was:myfs://host.a:456 testFullAuthorityWithOtherPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:null but was:java.lang.IllegalArgumentException: Wrong FS: myfs://host:456/file, expected: myfs://host.a.b:456 testIpAuthority(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://127.0.0.1:123 but was:myfs://localhost:123 testIpAuthorityWithDefaultPort(org.apache.hadoop.fs.TestFileSystemCanonicalization): expected:myfs://127.0.0.1:123 but was:myfs://localhost:123 Tests in error: testUnqualifiedUriContents(org.apache.hadoop.fs.shell.TestPathData): `d1': No such file or directory I am newbie in Hadoop 
source code world. Please help me in building the Hadoop source code. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8570) Bzip2Codec should accept .bz files too
Harsh J created HADOOP-8570: --- Summary: Bzip2Codec should accept .bz files too Key: HADOOP-8570 URL: https://issues.apache.org/jira/browse/HADOOP-8570 Project: Hadoop Common Issue Type: Improvement Components: io Affects Versions: 2.0.0-alpha, 1.0.0 Reporter: Harsh J The default extension reported for Bzip2Codec today is .bz2. This causes it not to pick up .bz files as Bzip2Codec files. Although the extension is not very popular today, it is still mentioned as a valid extension in the bunzip manual and we should support it. We should change the Bzip2Codec default extension to .bz, or we should add support for an extension list to allow for better detection across various aliases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
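The detection gap is easy to see with CompressionCodecFactory, which resolves codecs purely by file extension; a small demonstration:
{code}
// Demonstration: ".bz2" resolves to a codec, ".bz" currently resolves to none.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class BzExtensionCheck {
  public static void main(String[] args) {
    CompressionCodecFactory factory =
        new CompressionCodecFactory(new Configuration());
    CompressionCodec bz2 = factory.getCodec(new Path("file.bz2"));
    CompressionCodec bz = factory.getCodec(new Path("file.bz"));
    System.out.println("file.bz2 -> "
        + (bz2 == null ? "no codec" : bz2.getClass().getSimpleName()));
    System.out.println("file.bz  -> "
        + (bz == null ? "no codec" : bz.getClass().getSimpleName()));
  }
}
{code}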
[jira] [Resolved] (HADOOP-3450) Add tests to Local Directory Allocator for asserting their URI-returning capability
[ https://issues.apache.org/jira/browse/HADOOP-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3450. - Resolution: Fixed Fix Version/s: 2.0.1-alpha Target Version/s: (was: 2.0.1-alpha, 3.0.0) Committed to trunk and branch-2. Thank you Sho! Add tests to Local Directory Allocator for asserting their URI-returning capability --- Key: HADOOP-3450 URL: https://issues.apache.org/jira/browse/HADOOP-3450 Project: Hadoop Common Issue Type: Improvement Components: fs Affects Versions: 0.17.0 Reporter: Ari Rabkin Assignee: Sho Shimauchi Priority: Minor Labels: newbie Fix For: 2.0.1-alpha Attachments: HADOOP-3450.txt Original comment: {quote}Local directory allocator returns a bare path, without a URI specifier. This means that calling Path.getFileSystem will do the wrong thing with the returned path. Should really stick a file:// in front. Also it's test cases need to be improved to make sure this class works fine. {quote} Only the latter needed to be done (see below for discussion). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8531) SequenceFile Writer can throw out a better error if a serializer isn't available
Harsh J created HADOOP-8531: --- Summary: SequenceFile Writer can throw out a better error if a serializer isn't available Key: HADOOP-8531 URL: https://issues.apache.org/jira/browse/HADOOP-8531 Project: Hadoop Common Issue Type: Improvement Reporter: Harsh J Priority: Trivial Currently, if the provided Key/Value class lacks a proper serializer in the loaded config for the SequenceFile.Writer, we get an NPE as the null return goes unchecked. Hence we get: {code} java.lang.NullPointerException at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1163) at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1079) at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.init(SequenceFile.java:1331) at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:271) {code} We can provide a better message + exception in such cases. This is slightly related to MAPREDUCE-2584. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
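The unchecked null comes from the SerializationFactory lookup; a small demonstration (java.lang.Thread is just an arbitrary class with no registered serialization):
{code}
// Demonstration: no registered Serialization accepts an arbitrary class,
// so the lookup returns null, which SequenceFile.Writer then trips over.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.serializer.Serialization;
import org.apache.hadoop.io.serializer.SerializationFactory;

public class MissingSerializerDemo {
  public static void main(String[] args) {
    SerializationFactory factory = new SerializationFactory(new Configuration());
    Serialization<Thread> s = factory.getSerialization(Thread.class);
    System.out.println(s); // prints null; an IOException naming the class would be clearer
  }
}
{code}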
[jira] [Created] (HADOOP-8532) [Configuration] Increase or make variable substitution depth configurable
Harsh J created HADOOP-8532: --- Summary: [Configuration] Increase or make variable substitution depth configurable Key: HADOOP-8532 URL: https://issues.apache.org/jira/browse/HADOOP-8532 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 2.0.0-alpha Reporter: Harsh J We've had some users recently complain that the default MAX_SUBST hardcoded limit of 20 isn't sufficient for their substitution needs, and they wished it were configurable rather than having to resort to workarounds such as using smaller temporary substitutions and then building the fuller value from them. We should consider raising the default, or provide a way to make it configurable instead. Related: HIVE-2021 changed something similar for their HiveConf classes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
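For context, each ${var} reference consumes one expansion step when Configuration.get() resolves a value; chains deeper than the hardcoded limit are cut off. A small demo of the chaining involved:
{code}
// Demo: two levels of substitution resolve fine; MAX_SUBST (20) caps how
// deep such chains may go before expansion stops.
import org.apache.hadoop.conf.Configuration;

public class SubstDepthDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.set("base", "/data");
    conf.set("level1", "${base}/a");
    conf.set("level2", "${level1}/b");
    System.out.println(conf.get("level2")); // prints /data/a/b
  }
}
{code}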
[jira] [Resolved] (HADOOP-3421) Requirements for a Resource Manager for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3421. - Resolution: Duplicate Resolving as dupe of MAPREDUCE-279. Although, this is much better doc-wise and serves as a good reference. Please reopen if I missed something that the other didn't provide (and was the goal here). Requirements for a Resource Manager for Hadoop -- Key: HADOOP-3421 URL: https://issues.apache.org/jira/browse/HADOOP-3421 Project: Hadoop Common Issue Type: New Feature Reporter: Vivek Ratan This is a proposal to extend the scheduling functionality of Hadoop to allow sharing of large clusters without the use of HOD. We're suffering from performance issues with HOD and not finding it the right model for running jobs. We have concluded that a native Hadoop Resource Manager would be more useful to many people if it supported the features we need for sharing clusters across large groups and organizations. Below are the key requirements for a Resource Manager for Hadoop. First, some terminology used in this writeup: * *RM*: Resource Manager. What we're building. * *MR*: Map Reduce. * A *job* is an MR job for now, but can be any request. Jobs are submitted by users to the Grid. MR jobs are made up of units of computation called *tasks*. * A grid has a variety of *resources* of different *capacities* that are allocated to tasks. For the the early version of the grid, the only resource considered is a Map or Reduce slot, which can execute a task. Each slot can run one or more tasks. Later versions may look at resources such as local temporary storage or CPUs. * *V1*: version 1. Some features are simplified for V1. h3. Orgs, queues, users, jobs Organizations (*Orgs*) are distinct entities for administration, configuration, billing and reporting purposes. *Users* belong to Orgs. Orgs have *queues* of jobs, where a queue represents a collection of jobs that share some scheduling criteria. * *1.1.* For V1, each queue will belong to one Org and each Org will have one queue. * *1.2.* Jobs are submitted to queues. A single job can be submitted to only one queue. It follows that a job will have a user and an Org associated with it. * *1.3.* A user can belong to multiple Orgs and can potentially submit jobs to multiple queues. * *1.4.* Orgs are guaranteed a fraction of the capacity of the grid (their 'guaranteed capacity') in the sense that a certain capacity of resources will be at their disposal. All jobs submitted to the queues of an Org will have access to the capacity guaranteed to the Org. ** Note: it is expected that the sum of the guaranteed capacity of each Org should equal the resources in the Grid. If the sum is lower, some resources will not be used. If the sum is higher, the RM cannot maintain guarantees for all Orgs. * *1.5.* At any given time, free resources can be allocated to any Org beyond their guaranteed capacity. For example this may be in the proportion of guaranteed capacities of various Orgs or some other way. However, these excess allocated resources can be reclaimed and made available to another Org in order to meet its capacity guarantee. * *1.6.* N minutes after an org reclaims resources, it should have all its reserved capacity available. Put another way, the system will guarantee that excess resources taken from an Org will be restored to it within N minutes of its need for them. * *1.7.* Queues have access control. Queues can specify which users are (not) allowed to submit jobs to it. 
A user's job submission will be rejected if the user does not have access rights to the queue. h3. Job capacity * *2.1.* Users will just submit jobs to the Grid. They do not need to specify the capacity required for their jobs (i.e. how many parallel tasks the job needs). [Most MR jobs are elastic and do not require a fixed number of parallel tasks to run - they can run with as little or as much task parallelism as they can get. This amount of task parallelism is usually limited by the number of mappers required (which is computed by the system and not by the user) or the amount of free resources available in the grid. In most cases, the user wants to just submit a job and let the system take care of utilizing as many or as little resources as it can.] h3. Priorities * *3.1.* Jobs can optionally have priorities associated with them. For V1, we support the same set of priorities available to MR jobs today. * *3.2.* Queues can optionally support priorities for jobs. By default, a queue does not support priorities, in which case it will ignore (with a warning) any priority levels specified by jobs submitted to it. If
[jira] [Reopened] (HADOOP-3444) Implementing a Resource Manager (V1) for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-3444: - Implementing a Resource Manager (V1) for Hadoop --- Key: HADOOP-3444 URL: https://issues.apache.org/jira/browse/HADOOP-3444 Project: Hadoop Common Issue Type: New Feature Reporter: Vivek Ratan Attachments: RMArch-V1.jpg HADOOP-3421 lists the requirements for a Resource Manager for Hadoop. This Jira tracks its implementation. It is expected that this Jira will be used to keep track of various other Jiras that will be opened towards implementing Version 1 of the Resource Manager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-3444) Implementing a Resource Manager (V1) for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3444. - Resolution: Fixed MAPREDUCE-279 has covered this. Resolving as dupe, same as its parent. Implementing a Resource Manager (V1) for Hadoop --- Key: HADOOP-3444 URL: https://issues.apache.org/jira/browse/HADOOP-3444 Project: Hadoop Common Issue Type: New Feature Reporter: Vivek Ratan Attachments: RMArch-V1.jpg HADOOP-3421 lists the requirements for a Resource Manager for Hadoop. This Jira tracks its implementation. It is expected that this Jira will be used to keep track of various other Jiras that will be opened towards implementing Version 1 of the Resource Manager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-3444) Implementing a Resource Manager (V1) for Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3444. - Resolution: Duplicate (Re-resolving as dupe) Implementing a Resource Manager (V1) for Hadoop --- Key: HADOOP-3444 URL: https://issues.apache.org/jira/browse/HADOOP-3444 Project: Hadoop Common Issue Type: New Feature Reporter: Vivek Ratan Attachments: RMArch-V1.jpg HADOOP-3421 lists the requirements for a Resource Manager for Hadoop. This Jira tracks its implementation. It is expected that this Jira will be used to keep track of various other Jiras that will be opened towards implementing Version 1 of the Resource Manager. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8435) Propdel all svn:mergeinfo
Harsh J created HADOOP-8435: --- Summary: Propdel all svn:mergeinfo Key: HADOOP-8435 URL: https://issues.apache.org/jira/browse/HADOOP-8435 Project: Hadoop Common Issue Type: Bug Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Harsh J Assignee: Harsh J TortoiseSVN/some versions of svn have added several mergeinfo props to Hadoop's svn files/dirs (list below). We should propdel that unneeded property, and fix it up. This otherwise causes pain to those who backport with a simple root-dir-down command (svn merge -c num url/path). We should also make sure to update the HowToCommit page to advise against mergeinfo additions, to prevent this from recurring. Files affected are, from my propdel revert output earlier today: {code}
Reverted '.'
Reverted 'hadoop-hdfs-project'
Reverted 'hadoop-hdfs-project/hadoop-hdfs'
Reverted 'hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs'
Reverted 'hadoop-hdfs-project/hadoop-hdfs/src/main/java'
Reverted 'hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode'
Reverted 'hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs'
Reverted 'hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary'
Reverted 'hadoop-hdfs-project/hadoop-hdfs/src/main/native'
Reverted 'hadoop-mapreduce-project'
Reverted 'hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site'
Reverted 'hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt'
Reverted 'hadoop-mapreduce-project/conf'
Reverted 'hadoop-mapreduce-project/CHANGES.txt'
Reverted 'hadoop-mapreduce-project/src/test/mapred'
Reverted 'hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs'
Reverted 'hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs'
Reverted 'hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc'
Reverted 'hadoop-mapreduce-project/src/contrib'
Reverted 'hadoop-mapreduce-project/src/contrib/eclipse-plugin'
Reverted 'hadoop-mapreduce-project/src/contrib/block_forensics'
Reverted 'hadoop-mapreduce-project/src/contrib/index'
Reverted 'hadoop-mapreduce-project/src/contrib/data_join'
Reverted 'hadoop-mapreduce-project/src/contrib/build-contrib.xml'
Reverted 'hadoop-mapreduce-project/src/contrib/vaidya'
Reverted 'hadoop-mapreduce-project/src/contrib/build.xml'
Reverted 'hadoop-mapreduce-project/src/java'
Reverted 'hadoop-mapreduce-project/src/webapps/job'
Reverted 'hadoop-mapreduce-project/src/c++'
Reverted 'hadoop-mapreduce-project/src/examples'
Reverted 'hadoop-mapreduce-project/hadoop-mapreduce-examples'
Reverted 'hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml'
Reverted 'hadoop-mapreduce-project/bin'
Reverted 'hadoop-common-project'
Reverted 'hadoop-common-project/hadoop-common'
Reverted 'hadoop-common-project/hadoop-common/src/test/core'
Reverted 'hadoop-common-project/hadoop-common/src/main/java'
Reverted 'hadoop-common-project/hadoop-common/src/main/docs'
Reverted 'hadoop-common-project/hadoop-auth'
Reverted 'hadoop-project'
Reverted 'hadoop-project/src/site'
{code} Proposed fix (from http://stackoverflow.com/questions/767418/remove-unnecessary-svnmergeinfo-properties): {code}
svn propdel svn:mergeinfo -R
svn revert .
svn commit -m "appropriate message"
{code} (To be done on both branch-2 and trunk) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8395) Text shell command unnecessarily demands that a SequenceFile's key class be WritableComparable
Harsh J created HADOOP-8395: --- Summary: Text shell command unnecessarily demands that a SequenceFile's key class be WritableComparable Key: HADOOP-8395 URL: https://issues.apache.org/jira/browse/HADOOP-8395 Project: Hadoop Common Issue Type: Bug Components: util Affects Versions: 2.0.0 Reporter: Harsh J Priority: Trivial Text, from the Display set of shell commands (hadoop fs -text), has a strict subclass check requiring the key class loaded from a sequence file's header to be a subclass of WritableComparable. The sequence file writer itself has no such check (one can create sequence files with just plain Writable keys; Comparable is needed only for the sequence file's sorter, which not all of them use), and hence it's not reasonable for the Text command to carry it either. We should relax the check and simply look for Writable, not WritableComparable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
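A minimal sketch of the relaxed check described above; the class and method names here are illustrative and not the actual Display command code:
{code}
import java.io.IOException;
import org.apache.hadoop.io.Writable;

// Illustrative only; the real check lives in the shell's Display (-text) command.
class KeyClassCheck {
  static void checkKeyClass(Class<?> keyClass) throws IOException {
    // Before: WritableComparable.class.isAssignableFrom(keyClass)
    if (!Writable.class.isAssignableFrom(keyClass)) {
      throw new IOException(keyClass.getName() + " is not a Writable");
    }
  }
}
{code}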
[jira] [Created] (HADOOP-8359) Clear up javadoc warnings in hadoop-common-project
Harsh J created HADOOP-8359: --- Summary: Clear up javadoc warnings in hadoop-common-project Key: HADOOP-8359 URL: https://issues.apache.org/jira/browse/HADOOP-8359 Project: Hadoop Common Issue Type: Task Components: conf Affects Versions: 2.0.0 Reporter: Harsh J Priority: Trivial Javadocs added in HADOOP-8172 have introduced two new javadoc warnings. Should be easy to fix these (just missing #s for method refs). {code}
[WARNING] Javadoc Warnings
[WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java:334: warning - Tag @link: missing '#': addDeprecation(String key, String newKey)
[WARNING] /Users/harshchouraria/Work/code/apache/hadoop/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java:285: warning - Tag @link: missing '#': addDeprecation(String key, String newKey,
[WARNING] String customMessage)
{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
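For context, the fix here is mechanical: a member reference inside a {{@link}} tag needs a leading '#'. A minimal illustration, not the actual Configuration.java javadoc:
{code}
/** Minimal illustration of the javadoc fix; not the real Configuration class. */
public class JavadocLinkExample {
  /**
   * Correct form, with the '#' marking a member reference:
   * {@link #addDeprecation(String, String)}
   * Writing the same reference without the '#' produces the
   * "Tag @link: missing '#'" warning quoted above.
   */
  public void addDeprecation(String key, String newKey) { }
}
{code}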
[jira] [Resolved] (HADOOP-8323) Revert HADOOP-7940
[ https://issues.apache.org/jira/browse/HADOOP-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-8323. - Resolution: Won't Fix Target Version/s: (was: 3.0.0, 2.0.0) Actually, I looked at the API, and this would only impact Text usage iff clear() is called on it. I do not think we should revert this. Clear must work as intended - and clear the byte array state inside. There wouldn't be any other way to free the memory if we didn't do this. I do not see clear() being used in MR directly. So the largest length is still maintained, but that's not an issue (except that clear may be called for memory gains if the user wants that). I'm resolving this as Won't Fix (Won't revert). But if I've missed addressing something, please reopen. Revert HADOOP-7940 -- Key: HADOOP-8323 URL: https://issues.apache.org/jira/browse/HADOOP-8323 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Per [~jdonofrio]'s comments on HADOOP-7940, we should revert it as it has caused a performance regression (for scenarios where Text is reused, popular in MR). The clear() works as intended, as the API also offers a current length API. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
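To make the behavior under discussion concrete, a small sketch against the public Text API; it shows only the externally visible semantics, while what happens to the backing array internally is exactly the HADOOP-7940 question:
{code}
import org.apache.hadoop.io.Text;

public class TextClearDemo {
  public static void main(String[] args) {
    Text t = new Text("hello world");
    System.out.println(t.getLength()); // 11
    t.clear();                         // logical contents cleared
    System.out.println(t.getLength()); // 0
    t.set("again");                    // the reuse pattern popular in MR
    System.out.println(t);             // again
  }
}
{code}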
[jira] [Reopened] (HADOOP-8323) Revert HADOOP-7940
[ https://issues.apache.org/jira/browse/HADOOP-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reopened HADOOP-8323: - Revert HADOOP-7940 -- Key: HADOOP-8323 URL: https://issues.apache.org/jira/browse/HADOOP-8323 Project: Hadoop Common Issue Type: Bug Components: io Affects Versions: 2.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Critical Per [~jdonofrio]'s comments on HADOOP-7940, we should revert it as it has caused a performance regression (for scenarios where Text is reused, popular in MR). The clear() works as intended, as the API also offers a current length API. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-3977) SequenceFile.Writer reopen (hdfs append)
[ https://issues.apache.org/jira/browse/HADOOP-3977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3977. - Resolution: Duplicate A fresher effort is ongoing at HADOOP-7139 (Resolving as duplicate) SequenceFile.Writer reopen (hdfs append) Key: HADOOP-3977 URL: https://issues.apache.org/jira/browse/HADOOP-3977 Project: Hadoop Common Issue Type: Improvement Components: io Reporter: Karl Wettin Assignee: Karl Wettin Priority: Minor Attachments: HADOOP-3977.txt, HADOOP-3977.txt Allows for reopening and appending to a SequenceFile -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
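As a pointer for readers landing here: HADOOP-7139 later added an appendIfExists option to the SequenceFile writer. A small usage sketch under that assumption (the path is hypothetical, and the exact option name should be checked against your Hadoop version):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Sketch: reopen-and-append via the appendIfExists option from HADOOP-7139.
public class SeqAppend {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    try (SequenceFile.Writer w = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(new Path("/tmp/demo.seq")), // hypothetical path
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(IntWritable.class),
        SequenceFile.Writer.appendIfExists(true))) {
      w.append(new Text("k"), new IntWritable(1));
    }
  }
}
{code}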
[jira] [Created] (HADOOP-8301) Common (hadoop-tools) side of MAPREDUCE-4172
Harsh J created HADOOP-8301: --- Summary: Common (hadoop-tools) side of MAPREDUCE-4172 Key: HADOOP-8301 URL: https://issues.apache.org/jira/browse/HADOOP-8301 Project: Hadoop Common Issue Type: Task Components: build Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Patches from MAPREDUCE-4172 (for MR-relevant projects) that need to run off of the Hadoop Common project for Hadoop QA. One sub-task per hadoop-tools submodule will be added here for reviews. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8302) Clean up hadoop-rumen
Harsh J created HADOOP-8302: --- Summary: Clean up hadoop-rumen Key: HADOOP-8302 URL: https://issues.apache.org/jira/browse/HADOOP-8302 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor Clean up a bunch of existing javac warnings in hadoop-rumen module. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HADOOP-8303) Clean up hadoop-streaming
Harsh J created HADOOP-8303: --- Summary: Clean up hadoop-streaming Key: HADOOP-8303 URL: https://issues.apache.org/jira/browse/HADOOP-8303 Project: Hadoop Common Issue Type: Sub-task Components: build Affects Versions: 3.0.0 Reporter: Harsh J Assignee: Harsh J Priority: Minor Clean up a bunch of existing javac warnings in hadoop-streaming module. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-7431) Test DiskChecker's functionality in identifying bad directories (Part 2 of testing DiskChecker)
[ https://issues.apache.org/jira/browse/HADOOP-7431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-7431. - Resolution: Not A Problem See my earlier comment. This was already covered. Test DiskChecker's functionality in identifying bad directories (Part 2 of testing DiskChecker) --- Key: HADOOP-7431 URL: https://issues.apache.org/jira/browse/HADOOP-7431 Project: Hadoop Common Issue Type: Test Components: test, util Affects Versions: 0.23.0 Reporter: Harsh J Assignee: Harsh J Labels: test Fix For: 0.23.0 Add a test for the DiskChecker#checkDir method used in other projects (HDFS). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-1922) The task output promotion exception handler should include the IOException in the diagnostic message
[ https://issues.apache.org/jira/browse/HADOOP-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1922. - Resolution: Fixed This is now taken care of by the OutputCommitter framework. Around the time this was opened, though, it was addressed by the refactors at HADOOP-1874. The task output promotion exception handler should include the IOException in the diagnostic message Key: HADOOP-1922 URL: https://issues.apache.org/jira/browse/HADOOP-1922 Project: Hadoop Common Issue Type: Bug Reporter: Owen O'Malley Assignee: Devaraj Das When the JobTracker fails to promote output, it should have a more detailed error message that includes the exception that was thrown by the FileSystem operation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
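The fix pattern being asked for is simply to fold the thrown exception into the diagnostic string. A hedged sketch (the helper and its names are hypothetical, not the actual JobTracker code):
{code}
import java.io.IOException;
import org.apache.hadoop.util.StringUtils;

public class DiagnosticExample {
  // Hypothetical helper: include the causing IOException in the message.
  static String promoteFailureMessage(String taskId, IOException cause) {
    return "Failed to promote output for " + taskId + ": "
        + StringUtils.stringifyException(cause);
  }
}
{code}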
[jira] [Resolved] (HADOOP-1769) Possible StackOverflowError in FileSystem.get(Uri uri, Configuration conf) method
[ https://issues.apache.org/jira/browse/HADOOP-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-1769. - Resolution: Cannot Reproduce Doesn't look like it's a problem anymore. Here's a simple test to verify (there may already exist something like this, but I did not look): {code}
public void testStack() throws IOException, URISyntaxException {
  Configuration conf = new Configuration();
  String url = "/";
  URI uri = new URI(url);
  assertEquals(null, uri.getScheme());
  FileSystem fs = FileSystem.get(uri, conf);
}
{code} Marking as 'Cannot Reproduce' (now). Possible StackOverflowError in FileSystem.get(Uri uri, Configuration conf) method - Key: HADOOP-1769 URL: https://issues.apache.org/jira/browse/HADOOP-1769 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.14.0 Reporter: Thomas Friol When calling the method FileSystem.get(URI uri, Configuration conf) with a URI without a scheme - StackOverflowError {noformat}
Exception in thread "Main Thread" java.lang.StackOverflowError:
at java.util.regex.Matcher.<init>(Matcher.java:201)
at java.util.regex.Pattern.matcher(Pattern.java:879)
at org.apache.hadoop.conf.Configuration.substituteVars(Configuration.java:182)
at org.apache.hadoop.conf.Configuration.get(Configuration.java:247)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:143)
{noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-2221) Configuration.toString is broken
[ https://issues.apache.org/jira/browse/HADOOP-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-2221. - Resolution: Not A Problem Nicholas - Yep, looks invalid given the current state (0.23/trunk) of Configuration. Resources and default resources are now loaded during toString ops: {code}
@Override
public String toString() {
  StringBuilder sb = new StringBuilder();
  sb.append("Configuration: ");
  if (loadDefaults) {
    toString(defaultResources, sb);
    if (resources.size() > 0) {
      sb.append(", ");
    }
  }
  toString(resources, sb);
  return sb.toString();
}
{code} Closing as Not-a-problem (anymore). Configuration.toString is broken Key: HADOOP-2221 URL: https://issues.apache.org/jira/browse/HADOOP-2221 Project: Hadoop Common Issue Type: Bug Components: conf Affects Versions: 0.15.0 Reporter: Arun C Murthy Assignee: Arun C Murthy Attachments: HADOOP-2221_1_2007117.patch {{Configuration.toString}} doesn't string-ify the {{Configuration.resources}} field which was added in HADOOP-785. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-3291) Add StackWritable and QueueWritable classes
[ https://issues.apache.org/jira/browse/HADOOP-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-3291. - Resolution: Not A Problem I feel these are better suited as 3rd-party packages, or under projects like Mahout where they may be utilized as utility classes. However, feel free to reopen if you feel they add good value to Hadoop common itself. Add StackWritable and QueueWritable classes --- Key: HADOOP-3291 URL: https://issues.apache.org/jira/browse/HADOOP-3291 Project: Hadoop Common Issue Type: New Feature Components: io Environment: All Reporter: Dennis Kubes Assignee: Dennis Kubes Attachments: HADOOP-3291-1-20080421.patch Adds Writable classes for FIFO Queue and LIFO Stack data structures. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
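For reference, a hypothetical sketch of what such a QueueWritable might have looked like; this class was never added to Hadoop common, and all names here are illustrative:
{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Queue;
import org.apache.hadoop.io.Writable;

// Hypothetical FIFO-queue Writable of ints; never part of Hadoop common.
public class IntQueueWritable implements Writable {
  private final Queue<Integer> queue = new ArrayDeque<>();

  public void add(int v) { queue.add(v); }
  public Integer poll() { return queue.poll(); }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeInt(queue.size());
    for (int v : queue) {
      out.writeInt(v); // FIFO order is preserved on the wire
    }
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    queue.clear();
    int n = in.readInt();
    for (int i = 0; i < n; i++) {
      queue.add(in.readInt());
    }
  }
}
{code}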
[jira] [Resolved] (HADOOP-6342) Create a script to squash a common, hdfs, and mapreduce tarball into a single hadoop tarball
[ https://issues.apache.org/jira/browse/HADOOP-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-6342. - Resolution: Not A Problem Fix Version/s: (was: 0.21.1) 0.22.0 Looks like this was never marked as resolved after Tom's earlier comment that https://issues.apache.org/jira/browse/HADOOP-6846 had fixed it in 0.22. Create a script to squash a common, hdfs, and mapreduce tarball into a single hadoop tarball Key: HADOOP-6342 URL: https://issues.apache.org/jira/browse/HADOOP-6342 Project: Hadoop Common Issue Type: New Feature Components: build Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.22.0 Attachments: HADOOP-6342.2.patch, HADOOP-6342.patch, h-6342.patch, tar-munge, tar-munge It would be convenient for the transition if we had a script to take a set of common, hdfs, and mapreduce tarballs and merge them into a single tarball. This is intended just to help users who don't want to transition to split projects for deployment immediately. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-516) Eclipse-based GUI: DFS explorer and basic Map/Reduce job launcher
[ https://issues.apache.org/jira/browse/HADOOP-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-516. Resolution: Fixed The plugin seems to have integrated this feature already. Should this still be open, please reopen it. Eclipse-based GUI: DFS explorer and basic Map/Reduce job launcher - Key: HADOOP-516 URL: https://issues.apache.org/jira/browse/HADOOP-516 Project: Hadoop Common Issue Type: New Feature Environment: Eclipse 3.2 JDK 1.5 Reporter: Frédéric Bertin Attachments: hdfsExplorer.zip, hdfsExplorer2.zip To increase productivity in our current project (which makes heavy use of Hadoop), we wrote a small Eclipse-based GUI application which basically consists of 2 views:
* a HDFS explorer adapted from the Eclipse filesystem explorer example. For now, it includes the following features:
o classical tree-based browsing interface, with directory content being detailed in a 3-column table (file name, file size, file type)
o refresh button
o delete file or directory (with confirm dialog): select files in the tree or table and click the Delete button
o rename file or directory: simple click on the file in the table, type the new name and validate
o open file with system editor: select the file in the table and click the Open button (works on Windows, not on Linux)
o internal drag & drop
o external drag & drop from the local filesystem to the HDFS (the opposite doesn't work)
* a MapReduce *very* simple job launcher:
o select the job XML configuration file
o run the job
o kill the job
o visualize map and reduce progress with progress bars
o open a browser on the Hadoop job tracker web interface
INSTALLATION NOTES:
- Eclipse 3.2
- JDK 1.5
- import the archive in Eclipse
- copy your hadoop conf file (hadoop-default.xml in src folder) - this step should be moved into the GUI later
- right-click on the project and Run As - Eclipse Application
- enjoy...
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-323) IO Exception at LocalFileSystem.renameRaw, when running Nutch nightly builds (0.8-dev).
[ https://issues.apache.org/jira/browse/HADOOP-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-323. Resolution: Invalid Was fixed a long time ago, but wasn't closed. Doesn't apply today; I run LJRunner and it never really complains in any run about things like these. Closing as Invalid (now). IO Exception at LocalFileSystem.renameRaw, when running Nutch nightly builds (0.8-dev). --- Key: HADOOP-323 URL: https://issues.apache.org/jira/browse/HADOOP-323 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 0.3.2 Environment: Windows XP + CygWin Reporter: KuroSaka TeruHiko IO Exception at LocalFileSystem.renameRaw, when running Nutch nightly builds (0.8-dev). Please see the detailed descriptions in: http://issues.apache.org/jira/browse/NUTCH-266 Not knowing how to reclassify an existing bug, I am opening this new bug under Hadoop. The version number is 0.3.3, but because I don't see it in the jira list, I chose the closest matching version. The Nutch-with-GUI build was running with hadoop-0.2 but stopped running, exhibiting the same symptom with other nightly builds, when switched to use hadoop-0.3.3. I checked fs as component, but this bug could also be caused by the order in which jobs are scheduled, I suspect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HADOOP-466) Startup scripts will not start instances of Hadoop daemons w/different configs w/o setting separate PID directories
[ https://issues.apache.org/jira/browse/HADOOP-466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HADOOP-466. Resolution: Fixed Fix Version/s: 0.20.0 This problem indeed exists if one doesn't use HADOOP_IDENT_STRING, but that is a better workaround than adding a dependency on md5sum and the like (or do we already use it?). I think this may be resolved as fixed with the availability of HADOOP_IDENT_STRING to work around with. Workaround (tested to work in 0.20.2): {code}
# To start a second DN on the same machine, with a separate config:
HADOOP_IDENT_STRING=$USER-DN2 hadoop-daemon.sh --config /conf/dn2 start datanode
HADOOP_IDENT_STRING=$USER-DN2 hadoop-daemon.sh --config /conf/dn2 stop datanode
# These manage the PIDs as well, and will not complain that stuff is already running.
{code} Startup scripts will not start instances of Hadoop daemons w/different configs w/o setting separate PID directories --- Key: HADOOP-466 URL: https://issues.apache.org/jira/browse/HADOOP-466 Project: Hadoop Common Issue Type: Improvement Components: conf Affects Versions: 0.5.0 Reporter: Vetle Roeim Fix For: 0.20.0 Attachments: hadoop-466.diff Configuration directories can be specified by either setting HADOOP_CONF_DIR or using the --config command line option. However, the hadoop-daemon.sh script will not start the daemons unless the PID directory is separate for each configuration. The issue is that the code for generating PID filenames does not depend on the configuration directory. While the PID directory can be changed in hadoop-env.sh, it seems a little unnecessary to have this restriction. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira