Re: Hadoop2 LZO issue with pig

2014-03-10 Thread Bing Jiang
You need to register the necessary jar, which contains com.twitter.elephantbird.mapreduce.input.MultiInputFormat.class ... 2014-03-08 18:00 GMT+08:00 Viswanathan J jayamviswanat...@gmail.com: Hi, Getting this issue in hadoop 2.x with pig, java.lang.Exception: java.lang.RuntimeException:

Hadoop 2.2.0 not showing progress

2014-03-10 Thread Silvina Caíno Lores
Hi all, I've been noticing lately that sometimes my Hadoop jobs do not report progress in the terminal. They seem to be stuck at the Running job: job_ message; however, the YarnChild processes are running and executing properly. I know that my job didn't fail but it's very inconvenient not being

Hadoop Client on OSGi

2014-03-10 Thread Geoffry Roberts
All, I would like to gauge what the level of interest might be in having a version of the Hadoop client running as an OSGi bundle. I for one would like to use such a thing. There have been inquiries as to putting the whole Hadoop shooting match in OSGi. I'm talking here about just the

Re: Hadoop Client on OSGi

2014-03-10 Thread jb
Hi Geoffry, FYI, I just released a ServiceMix Bundle for hadoop-client 2.3.0 (I closed the SMX bundles release vote this morning, the artifact will be on Central today). On the other hand, I have a branch for Hadoop 2.3.0 (it's a fork on my machine; I will push it to my GitHub later today).

Re: regarding hadoop source code

2014-03-10 Thread Avinash Kujur
Hi, I downloaded the code from https://github.com/apache/hadoop-common.git, but while executing the command mvn install -DskipTests it gives this error partway through: [INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ hadoop-hdfs-httpfs --- [INFO] Compiling 56 source files to

Re: Hadoop Client on OSGi

2014-03-10 Thread Geoffry Roberts
JB, Thanks, I'll look forward to your push. Something I've noticed is that the hadoop client has any number of dependencies that are already osgi-ified. Some of these would be useful to have deployed apart from hadoop so they can be used by other bundles. It seems a shame to have them deployed twice.

Re: Hadoop Client on OSGi

2014-03-10 Thread jb
You can take a look at the SMX bundle that I did: http://svn.apache.org/repos/asf/servicemix/smx4/bundles/trunk/hadoop-client-2.3.0/pom.xml You will see the bundle's embedded dependencies, or the packages imported from other bundles. Regards JB On 2014-03-10 16:00, Geoffry Roberts wrote: JB,

Re: Hadoop 2.2.0 not showing progress

2014-03-10 Thread Xuan Gong
Hey, Silvina: You may find more information about this application from the RM web UI or the YARN command line (type yarn application -help to find out the commands). Thanks Xuan Gong On Mon, Mar 10, 2014 at 4:16 AM, Silvina Caíno Lores silvi.ca...@gmail.com wrote: Hi all, I've been noticing lately

Possible issues with Endian in Checksum Algorithms?

2014-03-10 Thread Sebastian Höhn
Hello, I installed Hadoop via BigTop on my IBM Power machines and hdfs keeps giving me checksum errors. I checked the name nodes, datanodes and everything is up and running. The web interface via port 50070 can even display the file. The command line tools always throw the checksum exception.
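
For readers unfamiliar with how byte order can affect a stored checksum, here is a minimal Java sketch. It does not diagnose the problem reported above and does not use any HDFS code; the class and method names (EndianChecksumDemo, crc32, store, load) are hypothetical and exist only for illustration. It computes a CRC32, writes it as four bytes in one byte order, and shows that reading those bytes back in the opposite order yields a different value, which a verifier would report as a mismatch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not taken from the HDFS code base.
class EndianChecksumDemo {

    // CRC32 of the data, truncated to the low 32 bits.
    static int crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return (int) crc.getValue();
    }

    // Serialize the checksum as 4 bytes using an explicit byte order.
    static byte[] store(int checksum, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(checksum).array();
    }

    // Read a 4-byte checksum back using an explicit byte order.
    static int load(byte[] stored, ByteOrder order) {
        return ByteBuffer.wrap(stored).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        int computed = crc32(data);
        byte[] onDisk = store(computed, ByteOrder.BIG_ENDIAN);

        // Writer and reader agree on byte order: the values match.
        System.out.println("same order matches:    "
                + (load(onDisk, ByteOrder.BIG_ENDIAN) == computed));
        // Reader assumes the opposite byte order: the values differ.
        System.out.println("swapped order matches: "
                + (load(onDisk, ByteOrder.LITTLE_ENDIAN) == computed));
    }
}
```

The sketch fixes the byte order explicitly on both sides; code that relies on the platform's native order instead would behave differently on big-endian hardware such as Power than on x86.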

User queues on Fair and Capacity scheduler

2014-03-10 Thread Ashwin Sai Shankar
Hi, We have a use case on our clusters where we want users to be in their own (leaf) queues, i.e. all jobs in a leaf queue belong to a particular user, so that: 1. We can assign capacity (Capacity Scheduler) or weight (Fair Scheduler) to individual users. 2. An app in a user-queue can preempt another user from other

Can a YARN Cient or Application Master determine when log aggregation has completed?

2014-03-10 Thread Geoff Thompson
Hello, Log aggregation is great. However, if a yarn application runs a large number of tasks which generate large logs, it takes some finite amount of time for all of the logs to be collected and written to the HDFS. Currently our client code runs the equivalent of the yarn logs command once

Re: Can a YARN Cient or Application Master determine when log aggregation has completed?

2014-03-10 Thread Zhijie Shen
Hi Geoff, Unfortunately, there's no such API for users to determine whether the log aggregation is completed or not, but the issue has been tackled. You can keep an eye on YARN-1279. - Zhijie On Mon, Mar 10, 2014 at 10:18 AM, Geoff Thompson ge...@bearpeak.com wrote: Hello, Log

How do I programmatically run a Hadoop 2.0 job from a Hadoop Client outside the cluster

2014-03-10 Thread Steve Lewis
Under Hadoop 0.2 I was able to run a Hadoop job from an external machine (say a Windows box with Cygwin) running on the same network as the cluster by setting fs.default.name in my Java code on the client machine and little else in the config file. With 2.0 I want to do something similar launching a

Re: Can a YARN Cient or Application Master determine when log aggregation has completed?

2014-03-10 Thread Geoff Thompson
Hi Zhijie, Thanks for letting us know this issue has been recognized. Thanks, Geoff On Mar 10, 2014, at 12:09 PM, Zhijie Shen zs...@hortonworks.com wrote: Hi Geoff, Unfortunately, there's no such a API for users to determine whether the log aggregation is completed or not, but the

Adding new dataNode to existing cluster for hadoop2.2

2014-03-10 Thread Parmeet
Hello, I am trying to add a new dataNode to an existing hadoop cluster and would like to know the exact steps to refresh the NameNode for the new node list. I am using the hadoop2.2 release. thanks, -R

Re: GC overhead limit exceeded

2014-03-10 Thread haihong lu
I have tried both of the methods you suggested, but the problem still exists. Thanks all the same. By the way, my hadoop version is 2.2.0, so the parameter mapreduce.map.memory.mb=3072 added to mapred-site.xml may have no effect. I have looked for this parameter in the hadoop documentation, but did

Re: regarding hadoop source code

2014-03-10 Thread Oleg Zhurakousky
You must be using Java 1.5 or below, where @Override is not allowed on a method that implements its counterpart from an interface. Remember, both 1.5 and 1.6 are EOL, so I would suggest upgrading to 1.7. Oleg On Mon, Mar 10, 2014 at 10:49 AM, Avinash Kujur avin...@gmail.com wrote: hi, i
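
As a minimal sketch of the pattern described in this reply: the class below puts @Override on a method that implements an interface method. The names (OverrideDemo, Greeter, PoliteGreeter) are hypothetical and unrelated to the hadoop-hdfs-httpfs sources. A compiler targeting source level 1.5 rejects this annotation placement, while 1.6 and later accept it.

```java
// Hypothetical illustration only; not taken from the Hadoop sources.
class OverrideDemo {

    interface Greeter {
        String greet(String name);
    }

    static class PoliteGreeter implements Greeter {
        // Accepted at source level 1.6 and later.
        // At source level 1.5, @Override is only permitted when overriding
        // a superclass method, so this line would fail to compile.
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    public static void main(String[] args) {
        System.out.println(new PoliteGreeter().greet("Hadoop"));
    }
}
```

Running java -version and mvn -version is a quick way to confirm which JDK the build is actually using.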

Re: GC overhead limit exceeded

2014-03-10 Thread unmesha sreeveni
Try increasing the memory for the datanode and see. This requires restarting hadoop: export HADOOP_DATANODE_OPTS=-Xmx10g This will set the heap to 10 GB. You can also add this at the start of the hadoop-env.sh file. On Tue, Mar 11, 2014 at 9:02 AM, haihong lu ung3...@gmail.com wrote: i have tried both of the

Re: Hadoop 2.2.0 not showing progress

2014-03-10 Thread sudhakara st
The RM is not able to schedule your jobs and is waiting indefinitely to schedule them, maybe because it cannot communicate with the NMs, cannot create the AM, or sufficient resources are not available in containers, etc. Check your configuration for Java heap, yarn.app.mapreduce.am.resource.mb,

Re: Adding new dataNode to existing cluster for hadoop2.2

2014-03-10 Thread Stanley Shi
Just start the new node with the same configuration as the namenode; after some time, you will see the new node list. Regards, Stanley Shi, On Tue, Mar 11, 2014 at 9:07 AM, Parmeet delhi...@yahoo.com wrote: Hello, I am trying to add a new dataNode to existing hadoop cluster you would