Reading files from hdfs directory

2014-03-13 Thread Satyam Singh
Hello, I want to read files from HDFS remotely through the camel-hdfs client. I have modified the camel-hdfs component to support Hadoop 2.2.0. I checked that the file I want to read exists on HDFS: [hduser@bl460cx2425 ~]$ hadoop fs -ls /user/hduser/collector/test.txt 14/03/13 09:13:31 WARN

RE: Reading files from hdfs directory

2014-03-13 Thread Vinayakumar B
Hi Satyam, Check whether your Camel client-side configuration points to the correct NameNode(s). What is the deployment: HA or non-HA? Also check whether the same exception appears in the (active) NameNode logs. If not, the request is going to some other NameNode. Regards, Vinayakumar B.
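
One way to act on the first suggestion is to look at which NameNode URI the client configuration actually contains. A minimal Python sketch, assuming the client reads a standard Hadoop core-site.xml and that the relevant key is fs.defaultFS (the Hadoop 2.x name; fs.default.name is the deprecated spelling). The function name, host, and port are illustrative and do not come from the thread:

```python
import xml.etree.ElementTree as ET

# Keys that can carry the default filesystem URI; fs.default.name is the
# deprecated pre-2.x spelling and is only used as a fallback here.
DEFAULT_FS_KEYS = ("fs.defaultFS", "fs.default.name")


def default_fs_from_xml(xml_text):
    """Return the default filesystem URI found in core-site.xml text, or None."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in DEFAULT_FS_KEYS:
            found[name] = value
    for key in DEFAULT_FS_KEYS:
        if key in found:
            return found[key]
    return None


# Hypothetical sample configuration; host and port are placeholders.
SAMPLE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

print(default_fs_from_xml(SAMPLE))
```

Comparing the printed URI with the address of the active NameNode shows quickly whether the client is talking to the wrong host.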

verbose output

2014-03-13 Thread Mahmood Naderan
Hi, Is there any verbosity flag for the hadoop and mahout commands? I cannot find such an option on the command line. Regards, Mahmood

Re: Use Cases for Structured Data

2014-03-13 Thread Dieter De Witte
Sandbox is just meant to be a learning environment, I guess, to see what's possible and how things can be connected. The real distribution will have much higher performance and is the one you need when you want to investigate performance issues. The only real drawback of the real distributions is that

Re: verbose output

2014-03-13 Thread Mahmood Naderan
The hadoop-2.3.0/log directory is empty when I run a mahout command that uses Hadoop. Regards, Mahmood On Thursday, March 13, 2014 12:53 PM, Sebastian Schelter s...@apache.org wrote: To my knowledge, there is no such flag for mahout. You can check hadoop's logs for further information however. On

Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan
The strange thing is that whether I use -Xmx128m or -Xmx16384m, the process stops at chunk #571 (571*64=36.5GB). I still haven't figured out whether this is a problem with the JVM, Hadoop, or Mahout. I have tested various parameters on 16GB RAM: <property><name>mapred.map.child.java.opts</name>
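
The 36.5 GB figure follows from the chunk count and the chunk size. A small sketch of the arithmetic, assuming the 64 is in megabytes (the message does not state the unit); the function name is illustrative:

```python
def mb_covered_by_chunks(chunk_count, chunk_size_mb=64):
    """Total megabytes covered by `chunk_count` chunks of `chunk_size_mb` each."""
    return chunk_count * chunk_size_mb


total_mb = mb_covered_by_chunks(571)
print(total_mb)         # 36544
print(total_mb / 1000)  # 36.544 (decimal GB, the "36.5GB" quoted above)
print(total_mb / 1024)  # 35.6875 (binary GiB)
```

Either way, the data processed before the stall is more than twice the 16 GB of RAM mentioned in the message.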

Streaming a subset of HBase data

2014-03-13 Thread Ian Brooks
Hi, I'm trying to implement a way of using hadoop-streaming-2.2.0.jar to export a subset of data (a time range) to a mapper and reducer application written in another language. However, I have been unable to get anything but all the data from the HBase table. Looking at the code and forums, it

Re: Use Cases for Structured Data

2014-03-13 Thread ados1...@gmail.com
Okay, thank you D, I will start playing around with the Sandbox version. On Thu, Mar 13, 2014 at 5:55 AM, Dieter De Witte drdwi...@gmail.com wrote: Sandbox is just meant to be a learning environment, I guess, to see what's possible and how things can be connected. The real distribution will

Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread ados1...@gmail.com
Hello Team, I am initiating a POC to see the value of having Hadoop in our architecture. After discussing my current scenario with experts here, I think it would be better for me to start with the sandbox version rather than an actual distribution from a POC point of view. My query here is how

RE: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread Martin, Nick
Hi Andy, Generally speaking, the folks participating on this list avoid questions of distribution preference. There are, perhaps obviously, both minor and significant differences in distributions that you should research and evaluate to find the best fit for your organization's strategy.

Re: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread ados1...@gmail.com
Thank you Martin. I will make sure that I do not ask vendor-specific questions on this forum. But since I am starting out with Hadoop, I wanted to learn what the key things are to keep in mind while deciding which distribution to take...open source hadoop, mapr m7,

Hbase create table error

2014-03-13 Thread Manish
Hi All, Below are the error details that I am getting when creating tables in HBase. All the services are running fine. hbase(main):001:0> create 't1', 'cf1' *ERROR: java.lang.NoClassDefFoundError: org/apache/hadoop/security/authentication/util/KerberosName* Here is some help for this

RE: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread Martin, Nick
Start here http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support The list of things you might consider before picking a distribution is quite likely limited only by one's imagination. So, start with the basics like hosted vs. in-house, what your use case(s) cover, etc.

Re: Solving heap size error

2014-03-13 Thread Mahmood Naderan
I am pretty sure that there is something wrong with hadoop/mahout/java. With any configuration, it gets stuck at chunk #571. Previous chunks are created rapidly, but it waits for about 30 minutes on 571, and that is the reason for the heap size error. I will try to submit a bug report.

Pig with Tez

2014-03-13 Thread Viswanathan J
Hi, Will Apache Pig run with Tez?

Re: Hortonworks HDP 2 sandbox or Cloudera CDH Distribution

2014-03-13 Thread ados1...@gmail.com
Thanks Nick, appreciate your inputs on this. On Thu, Mar 13, 2014 at 12:51 PM, Martin, Nick nimar...@pssd.com wrote: Start here http://wiki.apache.org/hadoop/Distributions%20and%20Commercial%20Support The list of things you might consider before picking a distribution is quite likely

Re: Pig with Tez

2014-03-13 Thread Kim Chew
Google is your friend, http://hortonworks.com/hadoop/tez/ Kim On Thu, Mar 13, 2014 at 12:16 PM, Viswanathan J jayamviswanat...@gmail.com wrote: Hi, Will Apache Pig run with Tez?

ResourceManager shutting down

2014-03-13 Thread John Lilley
We have this erratic behavior where every so often the RM will shut down with an UnknownHostException. The odd thing is, the host it complains about has been in use for days at that point without problem. Any ideas? Thanks, John 2014-03-13 14:38:14,746 INFO rmapp.RMAppImpl

Reg: Setting up Hadoop Cluster

2014-03-13 Thread ados1...@gmail.com
Hello Team, I have a question regarding putting data into HDFS and running MapReduce on data present in HDFS. 1. HDFS is a file system, so what kinds of clients are available to interact with it? Also, where do we need to install those clients? 2. Regarding Pig, Hive and MapReduce,

Re: Reg: Setting up Hadoop Cluster

2014-03-13 Thread Geoffry Roberts
Andy, Once you have hadoop running, you can run your jobs from the CLI of the name node. When I write a map reduce job, I jar it up, place it in, say, my home directory, and run it from there. I do the same with pig scripts. I've used neither hive nor cascading, but I imagine they would

Re: Reg: Setting up Hadoop Cluster

2014-03-13 Thread ados1...@gmail.com
Thank you Geoffry, I have some fundamental questions here. 1. Once I have installed Hadoop, how can I identify which node is the master node and which is a slave? 2. My understanding is that the master node is by default the namenode and slave nodes are data nodes, correct? 3. So I installed hadoop

RE: ResourceManager shutting down

2014-03-13 Thread John Lilley
Never mind... we figured out its DNS entry was going missing. john From: John Lilley [mailto:john.lil...@redpoint.net] Sent: Thursday, March 13, 2014 2:52 PM To: user@hadoop.apache.org Subject: ResourceManager shutting down We have this erratic behavior where every so often the RM will shut down
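
An intermittent DNS failure of this kind can be caught by checking periodically that every cluster host name still resolves. A minimal sketch with hypothetical host names; the resolver is passed in as a parameter so that the logic can be exercised without a network, and none of this comes from the thread itself:

```python
import socket


def unresolvable_hosts(hosts, resolve=socket.gethostbyname):
    """Return the subset of `hosts` whose names fail to resolve."""
    failed = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(host)
    return failed


def fake_resolver(host):
    """Stand-in resolver for demonstration: only 'node1' is known."""
    if host != "node1":
        raise socket.gaierror("name not known: " + host)
    return "10.0.0.1"


print(unresolvable_hosts(["node1", "node2"], resolve=fake_resolver))
```

Called with the default resolver and the real node list, for example from a cron job, a non-empty result would flag the missing entry before the ResourceManager trips over it.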

Re: Reg: Setting up Hadoop Cluster

2014-03-13 Thread Geoffry Roberts
Did you not populate the slaves file when you did your installation? In older versions of hadoop (< 2.0), there was a master file where you entered your name node. Nowadays there are multiple name nodes. I haven't worked with them as of yet. I installed pig, for example, on my name node and

Apache Tez supporting pig version

2014-03-13 Thread Viswanathan J
Hi, Which Pig version supports Apache Tez? Will Pig 0.12 support Tez, or only v0.14, which is yet to be released? Please help.

Re: Hadoop2.x reading data

2014-03-13 Thread Viswanathan J
Thanks Harsh. On Mar 11, 2014 11:19 PM, Harsh J ha...@cloudera.com wrote: This is a Pig problem, not a Hadoop 2.x one - can you please ask it at u...@pig.apache.org? You may have to subscribe to it first. On Tue, Mar 11, 2014 at 1:03 PM, Viswanathan J jayamviswanat...@gmail.com wrote: Hi,

RE: NodeManager health Question

2014-03-13 Thread Rohith Sharma K S
Hi, For troubleshooting, a few things you can verify: 1. Check the RM web UI to see whether there are any 'Active Nodes' in the YARN cluster: http://<yarn.resourcemanager.webapp.address>/cluster. Also check for Lost Nodes, Unhealthy Nodes, or Rebooted Nodes. If there are any active nodes,
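
The node counts shown in the web UI are also exposed by the ResourceManager REST API at /ws/v1/cluster/metrics. A minimal sketch that summarizes such a response; the sample payload is made up, and fetching it from a real cluster (for example with urllib.request) is deliberately left out:

```python
import json

# Node-state counters reported under "clusterMetrics" by the RM REST API.
NODE_FIELDS = ("activeNodes", "lostNodes", "unhealthyNodes", "rebootedNodes")


def node_summary(metrics_json):
    """Extract the node-state counters from a cluster metrics JSON document."""
    metrics = json.loads(metrics_json)["clusterMetrics"]
    return {field: metrics.get(field, 0) for field in NODE_FIELDS}


# Hypothetical response body, trimmed to the fields of interest.
SAMPLE_METRICS = json.dumps({
    "clusterMetrics": {
        "activeNodes": 0,
        "lostNodes": 1,
        "unhealthyNodes": 2,
        "rebootedNodes": 0,
    }
})

print(node_summary(SAMPLE_METRICS))
```

A summary with zero active nodes and non-zero lost or unhealthy counts points at the NodeManagers rather than at the submitted application.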

Re: ResourceManager shutting down

2014-03-13 Thread Hitesh Shah
Hi John, Would you mind filing a jira with more details? The RM going down just because a host was not resolvable or DNS timed out is something that should be addressed. thanks -- Hitesh On Mar 13, 2014, at 2:29 PM, John Lilley wrote: Never mind… we figured out its DNS entry was going

Re: ResourceManager shutting down

2014-03-13 Thread Jian He
Which Hadoop version are you running? This should have been fixed recently. Jian On Thu, Mar 13, 2014 at 8:33 PM, Hitesh Shah hit...@apache.org wrote: Hi John, Would you mind filing a jira with more details? The RM going down just because a host was not resolvable or DNS timed out is something

RE: ResourceManager shutting down

2014-03-13 Thread Rohith Sharma K S
Hi Hitesh, Yes, it is an issue. It is handled in https://issues.apache.org/jira/i#browse/YARN-713, which fixes the DNS issue. The fix is available in hadoop-2.4 (unreleased). Thanks & Regards Rohith Sharma K S -----Original Message----- From: Hitesh Shah [mailto:hit...@apache.org] Sent: 14

Difference between FILE_BYTES_READ vs HDFS_BYTES_READ.

2014-03-13 Thread Sai Sai
Can someone please help: 1. What is the difference between FILE_BYTES_READ and HDFS_BYTES_READ? Thanks, Sai

Is hdinsights a C# version of hadoop or is it in java.

2014-03-13 Thread Sai Sai
Is HDInsight a C# version of Hadoop, or is it in Java? Please let me know. Thanks, Sai

Re: Is hdinsights a C# version of hadoop or is it in java.

2014-03-13 Thread Marco Shaw
It is based on Java (uses Hortonworks); however, Microsoft provides a .NET SDK: http://hadoopsdk.codeplex.com Marco On Mar 14, 2014, at 2:32 AM, Sai Sai saigr...@yahoo.in wrote: Is HDInsight a C# version of Hadoop, or is it in Java? Please let me know. Thanks, Sai