Oozie DB connection taking old DNS entry

2013-08-23 Thread hadoop hive
Hi Folks, I am getting an error while starting Oozie: ERROR: Oozie could not be started REASON: org.apache.oozie.service.ServiceException: E0103: Could not load service classes, Cannot create PoolableConnectionFactory (null, message from server: Host 'Abc-new.corp.apple.com' is not allowed to
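
The "Host ... is not allowed to connect" text is MySQL's host-based privilege check rejecting the server's new DNS name. Assuming the Oozie metastore is backed by MySQL, a typical fix is to grant the Oozie database user access from the new hostname; the user, database, and password below are placeholders:

    -- Run on the MySQL server backing the Oozie metastore
    GRANT ALL PRIVILEGES ON oozie.* TO 'oozie'@'Abc-new.corp.apple.com'
        IDENTIFIED BY 'oozie_password';
    FLUSH PRIVILEGES;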

Re: HOW WEBHDFS WORKS

2013-08-23 Thread Visioner Sadak
Any hints, friends? On Thu, Aug 22, 2013 at 9:59 PM, Visioner Sadak visioner.sa...@gmail.com wrote: friends, does anyone know how WebHDFS internally works or how it uses the Jetty server within Hadoop
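
For background: WebHDFS exposes HDFS over a REST API served by the Jetty instances embedded in the NameNode and DataNodes. The NameNode handles metadata operations and answers data operations with an HTTP 307 redirect to a DataNode, which streams the actual bytes. A minimal curl sketch, assuming the 1.x default ports (50070 NameNode, 50075 DataNode):

    # Step 1: the NameNode replies with a 307 redirect; no data is sent yet
    curl -i -X PUT "http://namenode:50070/webhdfs/v1/tmp/test.txt?op=CREATE"
    # Step 2: re-issue the PUT with data against the DataNode URL taken
    # from the Location header of the redirect
    curl -i -X PUT -T test.txt "http://datanode:50075/webhdfs/v1/tmp/test.txt?op=CREATE&..."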

Re: question about fair scheduler

2013-08-23 Thread Sandy Ryza
That's right that the other 2 apps will end up getting 10 resources each, but as more resources are released, eventually the cluster will converge to a fair state. I.e., if the first app requested additional resources after releasing resources, it would not receive any more until either another

DICOM Image Processing using Hadoop

2013-08-23 Thread Shalish VJ
Hi, Is it possible to process DICOM images using Hadoop? Please help me with an example. Thanks, Shalish.

Re: DICOM Image Processing using Hadoop

2013-08-23 Thread kapil bhosale
Hi, there is an API called HIPI ( http://hipi.cs.virginia.edu/ ) for processing huge images using Hadoop. You can get some more information there. It might work in your case; if not, please ignore. Thanks and regards, Kapil On Fri, Aug 23, 2013 at 3:01 PM, Shalish VJ shalis...@yahoo.com wrote:

Re: DICOM Image Processing using Hadoop

2013-08-23 Thread Shalish VJ
Hi, I am trying to do a proof of concept on that. Please help if anyone has some idea; I couldn't find any on the internet. From: haiwei.xie-soulinfo haiwei@soulinfo.com To: user@hadoop.apache.org Sent: Friday, August 23, 2013 3:16 PM Subject: Re: DICOM

Re: running map tasks in remote node

2013-08-23 Thread rab ra
Thanks for the reply. I am basically exploring possible ways to work with the Hadoop framework for one of my use cases. I have my limitations in using HDFS, but I agree that using MapReduce in conjunction with HDFS makes sense. I successfully tested WholeFileInputFormat after some googling.
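
For reference, a minimal sketch of the WholeFileInputFormat pattern (along the lines of the well-known example from "Hadoop: The Definitive Guide"): it marks every file as unsplittable and hands the entire file content to a single map() call.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    public class WholeFileInputFormat
        extends FileInputFormat<NullWritable, BytesWritable> {

      @Override
      protected boolean isSplitable(JobContext context, Path file) {
        return false; // never split: one mapper per file
      }

      @Override
      public RecordReader<NullWritable, BytesWritable> createRecordReader(
          InputSplit split, TaskAttemptContext context) {
        return new WholeFileRecordReader();
      }
    }

    class WholeFileRecordReader
        extends RecordReader<NullWritable, BytesWritable> {

      private FileSplit split;
      private Configuration conf;
      private final BytesWritable value = new BytesWritable();
      private boolean processed = false;

      @Override
      public void initialize(InputSplit split, TaskAttemptContext context) {
        this.split = (FileSplit) split;
        this.conf = context.getConfiguration();
      }

      @Override
      public boolean nextKeyValue() throws IOException {
        if (processed) {
          return false;
        }
        // Read the whole file into a single value in one shot.
        byte[] contents = new byte[(int) split.getLength()];
        Path file = split.getPath();
        FSDataInputStream in = file.getFileSystem(conf).open(file);
        try {
          IOUtils.readFully(in, contents, 0, contents.length);
          value.set(contents, 0, contents.length);
        } finally {
          IOUtils.closeStream(in);
        }
        processed = true;
        return true;
      }

      @Override
      public NullWritable getCurrentKey() { return NullWritable.get(); }

      @Override
      public BytesWritable getCurrentValue() { return value; }

      @Override
      public float getProgress() { return processed ? 1.0f : 0.0f; }

      @Override
      public void close() { }
    }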

RE: Hadoop - impersonation doubts/issues while accessing from remote machine

2013-08-23 Thread Omkar Joshi
Thanks :) Regards, Omkar Joshi -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Friday, August 23, 2013 3:52 PM To: user@hadoop.apache.org Subject: Re: Hadoop - impersonation doubts/issues while accessing from remote machine I've answered this on the stackoverflow

Re: running map tasks in remote node

2013-08-23 Thread Shahab Yunus
You say: Each map process gets a line. The map process will then do a file transfer and process it. What file, and from where to where, is being transferred in the map? Are you sure that the mappers are not complaining about 'this' file access? Because this seems to be separate from the initial data

RE: running map tasks in remote node

2013-08-23 Thread java8964 java8964
It is possible to do what you are trying to do, but it only makes sense if your MR job is very CPU intensive and you want to use the CPU resources in your cluster instead of the IO. You may want to do some research about HDFS's role in Hadoop. First, it provides a central

Partitioner vs GroupComparator

2013-08-23 Thread Eugene Morozov
Hello, I have two different types of keys emitted from Map and processed by Reduce. These keys have some part in common, and I'd like similar keys to land in one reducer. For that purpose I used a Partitioner and partition everything that comes in by this common part. It seems to be fine, but MRUnit

Need Help

2013-08-23 Thread Manish Kumar
Hi All, I am new to Hadoop technology. I had used it only once, for my BE project, to create a weblog analyzer; the entire cluster was made up of 7 nodes. I am eager to know more about this technology and want to build my career in this. But I am not able to make out 1. How I can shape up myself to

Re: Partitioner vs GroupComparator

2013-08-23 Thread Harsh J
The partitioner runs on the map-end. It assigns a partition ID (reducer ID) to each key. The grouping comparator runs on the reduce-end. It helps reducers, which read off a merge-sorted single file, to understand how to break the sequential file into reduce calls of key, values[]. Typically one
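
A minimal sketch of the two pieces working together, assuming a Text key laid out as "naturalKey#secondaryPart" (this composite layout is a hypothetical illustration, not from the thread):

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Map end: route every key with the same natural part to the same reducer.
    public class NaturalKeyPartitioner extends Partitioner<Text, Text> {
      @Override
      public int getPartition(Text key, Text value, int numPartitions) {
        String natural = key.toString().split("#", 2)[0];
        return (natural.hashCode() & Integer.MAX_VALUE) % numPartitions;
      }
    }

    // Reduce end: make consecutive keys that share the natural part fall
    // into one reduce(key, values) call of the merge-sorted stream.
    class NaturalKeyGroupingComparator extends WritableComparator {
      protected NaturalKeyGroupingComparator() {
        super(Text.class, true); // create Text instances for comparison
      }
      @Override
      public int compare(WritableComparable a, WritableComparable b) {
        String na = a.toString().split("#", 2)[0];
        String nb = b.toString().split("#", 2)[0];
        return na.compareTo(nb);
      }
    }

These would be wired up via job.setPartitionerClass(NaturalKeyPartitioner.class) and job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class).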

Re: Partitioner vs GroupComparator

2013-08-23 Thread Jan Lukavský
Hi all, speaking of this, has anyone ever measured how much more data needs to be transferred over the network when using the GroupingComparator the way Harsh suggests? What do I mean: when you use the GroupingComparator, it hides from you the real key that you have emitted from the Mapper. You

Hadoop upgrade

2013-08-23 Thread Viswanathan J
Hi, We are planning to upgrade our production HDFS cluster from 1.0.4 to 1.2.1. If I directly upgrade the cluster, will it affect the edits, fsimage and checkpoints? Also, after the upgrade, will it read the blocks and files from the data nodes properly? Will a version ID conflict occur with the NN?

RE: yarn-site.xml and aux-services

2013-08-23 Thread John Lilley
Are there recommended conventions for adding additional code to a stock Hadoop install? It would be nice if we could piggyback on whatever mechanisms are used to distribute Hadoop itself around the cluster. John From: Vinod Kumar Vavilapalli [mailto:vino...@hortonworks.com] Sent: Thursday,

Re: yarn-site.xml and aux-services

2013-08-23 Thread Harsh J
The general practice is to install your deps into a custom location such as /opt/john-jars, and extend YARN_CLASSPATH to include the jars, while also configuring the classes under the aux-services list. You need to take care of deploying the jar versions under /opt/john-jars/ across the cluster
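
As a sketch, the yarn-site.xml wiring for a custom aux-service looks roughly like this (the service name and class are hypothetical placeholders; note also that the shuffle service name varies by release, e.g. mapreduce.shuffle in early 2.0.x and mapreduce_shuffle later):

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce.shuffle,myauxservice</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.myauxservice.class</name>
      <value>com.example.MyAuxService</value>
    </property>

Following the YARN_CLASSPATH suggestion above, the jars would then be picked up on every NodeManager with something like:

    # in yarn-env.sh, on every node
    export YARN_CLASSPATH=$YARN_CLASSPATH:/opt/john-jars/*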

Re: Partitioner vs GroupComparator

2013-08-23 Thread Shahab Yunus
@Jan, why not skip sending the 'hidden' part of the key as a value? Why not then pass the value as null or with some other value part? That way there is no duplication on the reducer side, and you can extract the 'hidden' part of the key yourself (which should be possible, as you will be encapsulating it in a

Re: Partitioner vs GroupComparator

2013-08-23 Thread Lukavsky, Jan
Hi Shahab, I'm not sure if I understand right, but the problem is that you need to put the data you want to secondary sort into your key class. But, what I just realized is that the original key probably IS accessible, because of the Writable semantics. As you iterate through the Iterable

Re: Partitioner vs GroupComparator

2013-08-23 Thread Shahab Yunus
Jan: is that you need to put the data you want to secondary sort into your key class. Yes, but then you can also avoid putting the secondary-sort column/data piece in the value part, and this way there will be no duplication. But, what I just realized is that the original key probably IS accessible,

Re: Partitioner vs GroupComparator

2013-08-23 Thread Lukavsky, Jan
Hi Shahab, thanks, I just missed the fact that the key gets updated while iterating over the values. Even after working with Hadoop for three years, there is always something that can surprise you. :-) Cheers, Jan Original message Subject: Re: Partitioner vs GroupComparator From:
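
The surprise is Hadoop's object reuse: the framework deserializes every record into the same key instance, so inside one grouped reduce call the hidden (secondary) part of the key changes underneath you as the values iterator advances. A sketch, assuming the hypothetical composite "naturalKey#secondaryPart" Text keys from earlier:

    import java.io.IOException;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class SecondarySortReducer extends Reducer<Text, Text, Text, Text> {
      @Override
      protected void reduce(Text key, Iterable<Text> values, Context context)
          throws IOException, InterruptedException {
        for (Text value : values) {
          // 'key' is one reused object: its contents are refreshed on each
          // iteration, so the full composite key (including the part the
          // grouping comparator hides) is visible right here.
          context.write(new Text(key), new Text(value)); // copy if kept
        }
      }
    }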

Getting HBaseStorage() to work in Pig

2013-08-23 Thread Botelho, Andrew
I am trying to use the function HBaseStorage() in my Pig code in order to load an HBase table into Pig. When I run my code, I get this error: ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable I believe the PIG_CLASSPATH needs to be extended to

Re: Getting HBaseStorage() to work in Pig

2013-08-23 Thread Ted Yu
Please look at the example in 15.1.1 under http://hbase.apache.org/book.html#tools On Fri, Aug 23, 2013 at 1:41 PM, Botelho, Andrew andrew.bote...@emc.com wrote: I am trying to use the function HBaseStorage() in my Pig code in order to load an HBase table into Pig. When I run my

RE: Getting HBaseStorage() to work in Pig

2013-08-23 Thread Botelho, Andrew
Could you explain what is going on here: HADOOP_CLASSPATH=`${HBASE_HOME}/bin/hbase classpath` ${HADOOP_HOME}/bin/hadoop jar ${HBASE_HOME}/hbase-VERSION.jar I'm not a Unix expert by any means. How can I use this to enable HBaseStorage() in Pig? Thanks, Andrew From: Ted Yu

Re: Getting HBaseStorage() to work in Pig

2013-08-23 Thread Shahab Yunus
You are here running multiple UNIX commands, and the end result, or the end command, is to run hbase-YOUR VERSION.jar using hadoop's *jar* command. So basically you add the HBase jars to the classpath of your Hadoop environment and then execute HBase tools using hadoop. If you get the message as
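
Along the same lines, Pig can be pointed at the HBase jars by extending PIG_CLASSPATH before launching the script; a hedged sketch (the table and column names are made up):

    # Put the HBase jars on Pig's classpath, reusing `hbase classpath`
    export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:${PIG_CLASSPATH}"
    pig load_from_hbase.pig

    -- load_from_hbase.pig
    rows = LOAD 'hbase://mytable'
           USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
               'cf:col1 cf:col2', '-loadKey true')
           AS (rowkey:chararray, col1:chararray, col2:chararray);
    DUMP rows;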

CVE-2013-2192: Apache Hadoop Man in the Middle Vulnerability

2013-08-23 Thread Aaron T. Myers
Hello, Please see below for the official announcement of a serious security vulnerability which has been discovered and subsequently fixed in Apache Hadoop releases. Best, Aaron -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 CVE-2013-2192: Apache Hadoop Man in the Middle Vulnerability

Writable readFields question

2013-08-23 Thread Ken Sullivan
For my application I'm decoding data of non-predetermined length in readFields(). I've found that parsing for 4 (ASCII End Of Transmission) or -1 tends to mark the end of the data stream. Is this reliable, or is there a better way? Thanks, Ken

How to specify delimiter in Hive select query

2013-08-23 Thread Shailesh Samudrala
I'm querying a Hive table on my cluster (*select * from table_name;*) and writing the select output to an output file using (*INSERT OVERWRITE DIRECTORY*). When I open the output file, I see that the columns are delimited by Hive's default delimiter (*^A or ctrl-A*). So my question is, is

Re: How to specify delimiter in Hive select query

2013-08-23 Thread Jagat Singh
This is not possible up to Hive 0.10; in Hive 0.11 there is a patch to do this. You can insert into some temp table with the required delimiter, or use some Pig action to replace it afterwards. Or, best, use Hive 0.11. On 24/08/2013 10:45 AM, Shailesh Samudrala shailesh2...@gmail.com wrote: I'm querying a Hive
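
Two hedged sketches of what is being described (table names are placeholders). On Hive 0.11+ (the patch referred to is likely HIVE-3682) the delimiter can be given inline; on earlier releases the common workaround is a temp table created with the delimiter you want:

    -- Hive 0.11 and later:
    INSERT OVERWRITE DIRECTORY '/tmp/csv_out'
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    SELECT * FROM table_name;

    -- Pre-0.11 workaround: route the output through a delimited temp table
    CREATE TABLE tmp_export
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    AS SELECT * FROM table_name;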

Re: Passing parameters to MapReduce through Oozie

2013-08-23 Thread Harsh J
If you use mapreduce-action, then is the configuration lookup done within a Mapper/Reducer? If so, how are you grabbing the configuration object? Via the overridden configure(JobConf conf) method? On Fri, Aug 23, 2013 at 11:07 PM, Shailesh Samudrala shailesh2...@gmail.com wrote: Hello, I am
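
For reference, in the old mapred API the configuration is grabbed by overriding configure(JobConf); a minimal sketch (the property name is a hypothetical placeholder for whatever is set in the Oozie action's configuration block):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class ParamMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

      private String myParam;

      @Override
      public void configure(JobConf conf) {
        // Properties from the Oozie action's <configuration> element
        // arrive here through the JobConf.
        myParam = conf.get("my.oozie.param", "default");
      }

      @Override
      public void map(LongWritable key, Text value,
                      OutputCollector<Text, Text> output, Reporter reporter)
          throws IOException {
        output.collect(new Text(myParam), value);
      }
    }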

io.file.buffer.size different when not running in proper bash shell?

2013-08-23 Thread Nathan Grice
Thanks in advance for any help. I have been banging my head against the wall on this one all day. When I run the cmd: hadoop fs -put /path/to/input /path/in/hdfs from the command line, the hadoop shell dutifully copies my entire file correctly, no matter the size. I wrote a webservice client for

Re: Writable readFields question

2013-08-23 Thread Harsh J
When you're encoding/write()-ing the writable, do you not know the length? If you do, store the length first, and you can solve your problem? On Sat, Aug 24, 2013 at 3:58 AM, Ken Sullivan sulli...@mayachitra.com wrote: For my application I'm decoding data in readFields() of non-predetermined
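
A minimal sketch of the length-prefix approach: write() records the byte count first, so readFields() can allocate and readFully() exactly that many bytes, with no sentinel value (EOT or -1) needed at all:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.Writable;

    public class BlobWritable implements Writable {
      private byte[] data = new byte[0];

      @Override
      public void write(DataOutput out) throws IOException {
        out.writeInt(data.length); // length prefix goes first
        out.write(data);
      }

      @Override
      public void readFields(DataInput in) throws IOException {
        int len = in.readInt(); // read the prefix back
        data = new byte[len];
        in.readFully(data); // exactly len bytes, no end marker needed
      }
    }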

Pig upgrade

2013-08-23 Thread Viswanathan J
Hi, I'm planning to upgrade the Pig version from 0.8.0 to 0.11.0; I hope this is a stable release. What are the improvements, key features, benefits and advantages of upgrading? Thanks, Viswa.J

Re: Pig upgrade

2013-08-23 Thread Harsh J
Apache Pig's own list at u...@pig.apache.org is the right place to ask this. On Sat, Aug 24, 2013 at 10:22 AM, Viswanathan J jayamviswanat...@gmail.com wrote: Hi, I'm planning to upgrade pig version from 0.8.0 to 0.11.0, hope this is stable release. So what are the improvements, key

Re: Pig upgrade

2013-08-23 Thread Viswanathan J
Thanks a lot. On Aug 24, 2013 10:38 AM, Harsh J ha...@cloudera.com wrote: The Apache Pig's own lists at u...@pig.apache.org is the right place to ask this. On Sat, Aug 24, 2013 at 10:22 AM, Viswanathan J jayamviswanat...@gmail.com wrote: Hi, I'm planning to upgrade pig version from