Hadoop 2.x provides NameNode (NN) high availability:
http://hadoop.apache.org/docs/r2.0.3-alpha/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html
However, it is in alpha stage right now.
Thanks
hemanth
On Sat, Apr 20, 2013 at 5:30 PM, Ascot Moss wrote:
> Hi,
>
> I am new to Hadoop, from Hadoop do
+ user@
Please do continue the conversation on the mailing list, in case others
like you can benefit from / contribute to the discussion
Thanks
Hemanth
On Sat, Apr 20, 2013 at 5:32 PM, Hemanth Yamijala wrote:
> Hi,
>
> My code is working with having mrunit-0.9.0-incubating-hadoop1
Sorry - no. I just wanted to know if you were using FUSE, because I knew of
no other way of mounting HDFS. Basically, I was wondering if some libraries
needed to be on the system path for the Java programs to work.
From your response it looks like you aren't using FUSE. So what are you using
to mount?
Hema
Hi,
If your goal is to use the new API, I am able to get it to work with the
following maven configuration:
<dependency>
  <groupId>org.apache.mrunit</groupId>
  <artifactId>mrunit</artifactId>
  <version>0.9.0-incubating</version>
  <classifier>hadoop1</classifier>
</dependency>
If I switch to the classifier hadoop2, I get the same errors as the ones you
are facing.
Thanks
Hemanth
On Sat,
As this is an HBase-specific question, it will be better to ask it on the
HBase user mailing list.
Thanks
Hemanth
On Fri, Apr 19, 2013 at 10:46 PM, Adrian Acosta Mitjans <
amitj...@estudiantes.uci.cu> wrote:
> Hello:
>
> I'm working on a project, and I'm using HBase for storing the da
Are you using FUSE for mounting HDFS?
On Fri, Apr 19, 2013 at 4:30 PM, lijinlong wrote:
> I mounted HDFS to a local directory for storage, that is /mnt/hdfs. I can do
> the basic file operations such as create, remove, copy etc. just using Linux
> commands and the GUI. But when I tried to do the same thi
/mnt/san1 - owned by aye, hadmin and user mapred is trying to write to this
directory. Can you look at your core-, hdfs- and mapred-site.xml to see
where /mnt/san1 is configured as a value - that might make it clearer
what needs to be changed.
I suspect this could be one of the system directori
Are you trying to implement something like namespace federation, which is
part of Hadoop 2.0 -
http://hadoop.apache.org/docs/r2.0.3-alpha/hadoop-project-dist/hadoop-hdfs/Federation.html
On Thu, Apr 18, 2013 at 10:02 PM, Lixiang Ao wrote:
> Actually I'm trying to do something like combining mult
>
> Thanks,
>
> Jane
>
> From: Hemanth Yamijala [mailto:yhema...@thoughtworks.com]
> Sent: Wednesday, April 17, 2013 9:11 PM
>
> To: user@hadoop.apache.org
> Subject: Re: How to configure mapreduce archive size?
I don't think that is possible. When we use -getmerge, the destination
filesystem happens to be a LocalFileSystem which extends from
ChecksumFileSystem. I believe that's why the CRC files are getting created.
Would it not be possible for you to ignore them, since they have a fixed
extension?
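For example (hypothetical paths, not from the original thread), the checksum
file shows up next to the merged local output and can simply be deleted or
filtered out by its extension:

  hadoop fs -getmerge /user/jane/output merged.txt
  rm -f .merged.txt.crc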
Thanks
H
; I will contact them again.
>
>
> Thanks,
>
> Jane
>
> From: Hemanth Yamijala [mailto:yhema...@thoughtworks.com]
> Sent: Tuesday, April 16, 2013 9:35 PM
>
> To: user@hadoop.apache.org
> Subject: Re: How to configur
ou help?
>
>
> Thanks.
>
> Xia
>
> From: Hemanth Yamijala [mailto:yhema...@thoughtworks.com]
> Sent: Thursday, April 11, 2013 9:09 PM
>
> To: user@hadoop.apache.org
> Subject: Re: How to configure map
onfiguration().set(TableOutputFormat.OUTPUT_TABLE,
> tableName);
>
> job.setNumReduceTasks(0);
>
> boolean b = job.waitForCompletion(true);
>
> From: Hemanth Yamijala [mailto:yhema...@thoughtworks.com]
> Sent: Thursday
AFAIK, the cp command works fully from the DFS client. It reads bytes from
the InputStream created when the file is opened and writes the same to the
OutputStream of the file. It does not work at the level of data blocks. A
configuration io.file.buffer.size is used as the size of the buffer used in
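For illustration, here is a rough sketch (not the actual FsShell source, and
the paths are hypothetical) of such a client-side copy:

  import java.io.InputStream;
  import java.io.OutputStream;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IOUtils;

  public class ClientSideCopy {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = FileSystem.get(conf);
      // Buffer size used for the byte-by-byte copy through the client.
      int bufferSize = conf.getInt("io.file.buffer.size", 4096);
      InputStream in = fs.open(new Path("/user/xia/source.txt"));
      OutputStream out = fs.create(new Path("/user/xia/dest.txt"));
      // Bytes stream through the client, not block-to-block between datanodes;
      // the final 'true' closes both streams when the copy finishes.
      IOUtils.copyBytes(in, out, bufferSize, true);
    }
  }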
oot/mapred/local/archive already goes more than 1G now. Looks
> like it does not do the work. Could you advise if what I did is correct?
>
> <property>
>   <name>local.cache.size</name>
>   <value>50</value>
> </property>
>
>
> Thanks,
>
>
>
> Xia
>
>
Hi,
This directory is used as part of the 'DistributedCache' feature. (
http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#DistributedCache).
There is a configuration key "local.cache.size" which controls the amount
of data stored under DistributedCache. The default limit is 10GB. However,
t?
>
> Alberto
>
> On 28 March 2013 13:12, Hemanth Yamijala
> wrote:
> > Hmm. That feels like a join. Can't you read the input file on the map
> side
> > and output those keys along with the original map output keys? That way
> the
> > reducer would aut
eys a
> particular reducer will receive.
> So, my intention is to know the keys in the setup method to store only
> the needed lines.
>
> Thanks,
> Alberto
>
>
> On 28 March 2013 11:01, Hemanth Yamijala
> wrote:
> > Hi,
> >
> > Not sure if
Hi,
Not sure if I am answering your question, but this is the background. Every
MapReduce job has a partitioner associated with it. The default partitioner
is a HashPartitioner. As a user, you can write your own partitioner as well
and plug it into the job. The partitioner is responsible for splittin
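As a minimal sketch (assuming Text keys and the new API; this code is not
from the original thread), a custom partitioner could look like:

  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Partitioner;

  // Routes keys to reducers by their first character instead of the
  // default hash of the whole key.
  public class FirstCharPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
      String s = key.toString();
      if (s.isEmpty()) {
        return 0;
      }
      return (s.charAt(0) & Integer.MAX_VALUE) % numPartitions;
    }
  }

It would be plugged into the job with
job.setPartitionerClass(FirstCharPartitioner.class).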
I don't think it is documented in mapred-default.xml, where it should
ideally be. I could see it only in code. You can take a look at it here, if
you are interested: http://goo.gl/k5xsI
Thanks
Hemanth
On Wed, Mar 27, 2013 at 7:07 PM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:
> Oh! g
or="./dump.sh"
> attempt_201302211510_81218_m_00_0: # Executing /bin/sh -c
> "./dump.sh"...
> attempt_201302211510_81218_m_00_0: put: File myheapdump.hprof does not
> exist.
> attempt_201302211510_81218_m_00_0: log4j:WARN No appenders could b
yList.ensureCapacity(ArrayList.java:167)
> at java.util.ArrayList.add(ArrayList.java:351)
> at
> com.hadoop.publicationMrPOC.PublicationMapper.configure(PublicationMapper.java:59)
> ... 22 more
>
>
>
>
>
> On Wed, Mar 27, 2013 at 10:16 AM, Hemanth
The stack trace indicates the job client is trying to submit a job to the
MR cluster and it is failing. Are you certain that at the time of
submitting the job, the JobTracker is running (on localhost:54312)?
Regarding using a different file system - it depends a lot on what file
system you are
pdump.hprof -XX:OnOutOfMemoryError=./dump.sh'
>
> This should create the heap dump on hdfs at /tmp/myheapdump_knoguchi.
>
> Koji
>
>
> On Mar 26, 2013, at 11:53 AM, Hemanth Yamijala wrote:
>
> > Hi,
> >
> > I tried to use the -XX:+HeapDumpOnOutOfMemoryE
matching
a pattern. However, these are NOT retaining the current working directory.
Hence, there is no option to get this from a cluster, AFAIK.
You are effectively left with the jmap option on a pseudo-distributed
cluster, I think.
Thanks
Hemanth
On Tue, Mar 26, 2013 at 11:37 AM, Hemanth Yamijala
aintained by third party.
> I only have have a edge node through which I can submit the jobs.
>
> Is there any other way of getting the dump instead of physically going to
> that machine and checking out.
>
>
>
> On Tue, Mar 26, 2013 at 10:12 AM, Hemanth Yamijala <
. So I am trying to read
> the whole file and load it into list in the mapper.
>
> For each and every record Iook in this file which I got from distributed
> cache.
>
> —
> Sent from iPhone
>
>
> On Mon, Mar 25, 2013 at 6:39 PM, Hemanth Yamijala <
> yhema...@t
tried out your suggestion loading a 420 MB file into memory. It threw a java
> heap space error.
>
> I am not sure where this 1.6 GB of configured heap went to?
>
>
> On Mon, Mar 25, 2013 at 12:01 PM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Hi,
Hi,
The free memory might be low, just because GC hasn't reclaimed what it can.
Can you just try reading in the data you want to read and see if that works
?
Thanks
Hemanth
On Mon, Mar 25, 2013 at 10:32 AM, nagarjuna kanamarlapudi <
nagarjuna.kanamarlap...@gmail.com> wrote:
> io.sort.mb = 256
Which version of Hadoop are you using? A quick search shows me a bug
https://issues.apache.org/jira/browse/HADOOP-5241 that seems to show
similar symptoms. However, that was fixed a long while ago.
On Sat, Mar 23, 2013 at 4:40 PM, Redwane belmaati cherkaoui <
reduno1...@googlemail.com> wrote:
>
Any MapReduce task needs to communicate periodically with the tasktracker
that launched it, in order to let the tasktracker know it is still alive and
active. The time for which silence is tolerated is controlled by the
configuration property mapred.task.timeout.
It looks like in your case, this has
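As a minimal sketch (new API; this code is not from the original thread), a
long-running map() can keep heartbeating so the task is not killed after
mapred.task.timeout (600000 ms, i.e. 10 minutes, by default):

  import java.io.IOException;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class SlowMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (int i = 0; i < 100; i++) {
        doExpensiveWork(value);   // hypothetical long-running step
        context.progress();       // tells the tasktracker the task is alive
      }
      context.write(value, value);
    }

    private void doExpensiveWork(Text value) { /* placeholder */ }
  }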
hich says it is for backporting 3357 to branch 0.23
>
> So, I don't understand whether the fix is really in 2.0.0-alpha, request
> you to please clarify me.
>
> Thanks,
> Kishore
>
>
>
>
>
> On Thu, Mar 21, 2013 at 9:57 AM, Hemanth Yamijala <
>
There was an issue related to hung connections (HDFS-3357). But the JIRA
indicates the fix is available in Hadoop-2.0.0-alpha. Still, it would be
worth checking Sandy's suggestion.
On Wed, Mar 20, 2013 at 11:09 PM, Sandy Ryza wrote:
> Hi Kishore,
>
> 50010 is the datanode port. Does your lsof ind
'd
> rather keep it like this if I can make it work.
>
> Any idea besides hadoop version?
>
> Thanks!
>
> Lucas
>
>
>
> On Sat, Feb 23, 2013 at 11:54 AM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Hi Lucas,
>>
>> I tried somet
>
> hadoop fs -tail works just fine, and reading the file using
> org.apache.hadoop.fs.FSDataInputStream also works ok.
>
> Last thing, the web interface doesn't see the contents, and the command
> hadoop fs -ls says the file is empty.
>
> What am I doing wrong?
>
>
Yes. It corresponds to the JT start time.
Thanks
hemanth
On Sat, Feb 23, 2013 at 5:37 PM, Manoj Babu wrote:
> Bharath,
> I can understand that it's a timestamp.
> What does the identifier mean? Does it hold the job tracker instance
> start time?
>
> Cheers!
> Manoj.
>
>
> On Sat, Feb 23, 2013
Can you try this? Pick a class like WordCount from your package and
execute this command:
javap -classpath <path to your classes> -verbose org.myorg.WordCount | grep
version
For example, here's what I get for my class:
$ javap -verbose WCMapper | grep version
minor version: 0
major version: 50
Please paste the out
Could you please clarify: are you opening the file in your mapper code and
reading from there?
Thanks
Hemanth
On Friday, February 22, 2013, Lucas Bernardi wrote:
> Hello there, I'm trying to use hadoop map reduce to process an open file. The
> writing process writes a line to the file and sync
Supporting a multiuser scenario like this is always hard under Hadoop.
There are a few configuration knobs that offer some administrative control
and protection.
Specifically for the problem you describe, you could probably set
mapred.{map|reduce}.child.ulimit on the tasktrackers, so that any j
I may be guessing here a bit. Basically a filesystem is identified by the
protocol part of the URI of a file - so a file on the S3 filesystem will
have a URI like s3://... If you look at the core-default.xml file in Hadoop
source, you will see configuration keys like fs.<scheme>.impl, and the
value is a cla
from occurring, by disallowing the in-memory shuffle from using
> up all the JVM heap.
>
> Is it possible that the continued existence of this OutOfMemoryError
> represents a bug in ShuffleRamManager, or in some other code that is
> intended to prevent this situation from occurring?
>
There are a few tweaks in configuration that may help. Can you please look
at
http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Shuffle%2FReduce+Parameters
Also, since you have mentioned reducers are unbalanced, could you use a
custom partitioner to balance out the outputs? Or just increas
Hi,
In the past, some tests have been flaky. It would be good if you can search
jira and see whether this is a known issue. Else, please file it, and if
possible, provide a patch. :)
Regarding whether this will be a reliable build, it depends a little bit on
what you are going to use it for. For
, 2013, Fatih Haltas wrote:
> Yes, I reorganized the packages but I am still getting the same error. My
> hadoop version is 1.0.4
>
> On Tuesday, February 19, 2013, Hemanth Yamijala wrote:
>
> I am not sure if that will actually work, because the class is defined to
>
a:266)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>
>
>
> On Tue, Feb 19, 2013 at 8:10 PM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
> Sorry. I did not read the mail correctly. I think the error is in how the
> jar has been
Sorry. I did not read the mail correctly. I think the error is in how the
jar has been created. The classes start with wordcount_classes as the root,
instead of org.
Thanks
Hemanth
On Tuesday, February 19, 2013, Hemanth Yamijala wrote:
> Have you used the Api setJarByClass in your main prog
Have you used the API setJarByClass in your main program?
http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/Job.html#setJarByClass(java.lang.Class)
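For reference, here is a minimal driver sketch (class names are hypothetical,
not from the original thread) showing where the call goes:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;

  public class WordCountDriver {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      Job job = new Job(conf, "word count");
      // Tells Hadoop which jar to ship to the cluster, so the mapper and
      // reducer classes can be found on the task nodes.
      job.setJarByClass(WordCountDriver.class);
      // job.setMapperClass(...); job.setReducerClass(...); etc.
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }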
On Tuesday, February 19, 2013, Fatih Haltas wrote:
> Hi everyone,
>
> I know this is the common mistake to not specify the class
Hemanth sir. BTW, what exactly is
> the kind of processing which you are planning to do on your data.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Tue, Feb 19, 2013 at 6:44 AM, Hemanth Yamijala <
> yhema...@thoughtworks.co
08,
> and we don't need to develop a professional app, we just need to develop it
> fast and get our experiment results soon.
> Thanks
>
>
> On 02/18/2013 11:58 PM, Hemanth Yamijala wrote:
>
> What database is this ? Was hbase mentioned ?
>
> On Monday, February 18, 2013,
What database is this? Was HBase mentioned?
On Monday, February 18, 2013, Mohammad Tariq wrote:
> Hello Masoud,
>
> You can use the Bulk Load feature. You might find it more
> efficient than normal client APIs or using the TableOutputFormat.
>
> The bulk load feature uses a MapReduce
Hi,
It may be useful to post this question on the oozie user mailing list.
There are likely to be more expert users there. u...@oozie.apache.org
Thanks
Hemanth
On Friday, February 15, 2013, anand verma wrote:
> Hi,
>
> I am struggling for many days to install Oozie 3.3.1 on Hadoop 1.1.1.
> Oozi
This seems to be related to the % used capacity at a datanode. The values
are computed for all the live datanodes, and the range / central limits /
deviations are computed based on a sorted list of the values.
Thanks
hemanth
On Thu, Feb 14, 2013 at 2:42 PM, Dhanasekaran Anbalagan
wrote:
> Hi Gu
Can you please include the complete stack trace and not just the root.
Also, have you set fs.default.name to an HDFS location like
hdfs://localhost:9000?
Thanks
Hemanth
On Wednesday, February 13, 2013, Alex Thieme wrote:
> Thanks for the prompt reply and I'm sorry I forgot to include the
> excep
Adding on to the response: looking at the existing source code of
LineRecordReader, which has a similar function (it reads across HDFS blocks
to align with line boundaries), may also help you write similar code.
Harsh had responded with more specific details as to where to look on the
list before. F
Hi,
Hadoop On Demand is no longer supported with recent releases of Hadoop.
There is no separate user list for HOD related questions.
Which version of Hadoop are you using right now ?
Thanks
hemanth
On Wed, Feb 6, 2013 at 8:59 PM, Mehmet Belgin
wrote:
> Hello again,
>
> Considering that I hav
Previously, I have resolved this error by building a jar and then using the
API job.setJarByClass(<your driver class>.class). Can you please try that once?
On Thu, Jan 31, 2013 at 6:40 PM, Vikas Jadhav wrote:
> Hi I know it class not found error
> but I have Map and reduce Class as part of Driver class
> So what
er code not to close the FS.
> It will go away when the task ends anyway.
>
> Thx
>
>
> On Thu, Jan 24, 2013 at 5:26 PM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Hi,
>>
>> We are noticing a problem where we get a filesystem closed exc
Hi,
Partial answer: you can get the blacklisted tasktrackers using the command
line:
mapred job -list-blacklisted-trackers.
Also, I think that a blacklisted tasktracker becomes 'unblacklisted' if it
works fine after some time. Though I am not very sure about this.
Thanks
hemanth
On Wed, Jan 30,
Could you post the stack trace from the job logs? Also, looking at the task
tracker logs on the failed nodes may help.
Thanks
Hemanth
On Friday, January 25, 2013, Terry Healy wrote:
> Running hadoop-0.20.2 on a 20 node cluster.
>
> When running a Map/Reduce job that uses several .jars loaded into
This may be of some use, about how maps are decided:
http://wiki.apache.org/hadoop/HowManyMapsAndReduces
Thanks
Hemanth
On Friday, January 25, 2013, jamal sasha wrote:
> Hi.
> A very very lame question.
> Does the number of mappers depend on the number of nodes I have?
> How I imagine map-reduce
>
> On Fri, Jan 25, 2013 at 6:56 AM, Hemanth Yamijala
> wrote:
> > Hi,
> >
> > We are noticing a problem where we get a filesystem closed exception
> when a
> > map task is done and is finishing execution. By map task, I literally
> mean
> > the MapTask clas
Hi,
We are noticing a problem where we get a filesystem closed exception when a
map task is done and is finishing execution. By map task, I literally mean
the MapTask class of the map reduce code. Debugging this we found that the
mapper is getting a handle to the filesystem object and itself calli
On top of what Bejoy said, just wanted to add that when you submit a job to
Hadoop using the hadoop jar command, the jars which you reference in the
command on the edge/client node will be picked up by Hadoop and made
available to the cluster nodes where the mappers and reducers run.
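For instance (a hypothetical invocation; the jars exist only on the
edge/client node, and -libjars assumes the driver uses
ToolRunner/GenericOptionsParser):

  hadoop jar myjob.jar org.myorg.MyDriver -libjars dep.jar /input /output

The main jar and anything passed via -libjars are shipped to the cluster and
localized on the nodes that run the tasks.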
Thanks
Hemant
t, Reporter reporter) throws
> IOException {
> int sum = baseSum;
> while (values.hasNext()) {
> sum += values.next().get();
> }
> output.collect(key, new IntWritable(sum));
> }
> }
>
> On Mon, Jan 21, 2013 at 8:29 PM, Hemanth Yamijala
Hi,
Please note that you are referring to a very old version of Hadoop. The
current stable release is Hadoop 1.x. The API has changed in 1.x. Take a
look at the wordcount example here:
http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Example%3A+WordCount+v2.0
But, in principle your meth
Hi,
Not sure how to do it using MRUnit, but it should be possible to do this
using a mocking framework like Mockito or EasyMock. In a mapper (or reducer),
you'd use the Context classes to get the DistributedCache files. By mocking
these to return what you want, you could potentially run a true unit t
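A rough sketch with Mockito (untested, and the mapper class, configuration key
and file path are all hypothetical): the Context is mocked so that
getConfiguration() points the mapper at a local test file instead of a real
DistributedCache entry.

  import static org.mockito.Mockito.mock;
  import static org.mockito.Mockito.when;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.junit.Test;

  public class MyMapperTest {
    @Test
    @SuppressWarnings({"unchecked", "rawtypes"})
    public void setupLoadsLookupData() throws Exception {
      Configuration conf = new Configuration();
      conf.set("my.lookup.file", "src/test/resources/lookup.txt");

      Mapper.Context context = mock(Mapper.Context.class);
      when(context.getConfiguration()).thenReturn(conf);

      MyMapper mapper = new MyMapper();  // hypothetical class under test
      mapper.setup(context);             // assumes setup() is visible to the test
      // ...assert on the state the mapper built from the lookup file
    }
  }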
failed when I tried to open it.
Restarting the daemons helped.
I don't think this problem will occur in a normal up-and-running production
cluster.
Thanks
hemanth
On Thu, Jan 17, 2013 at 9:48 AM, Hemanth Yamijala wrote:
> At the place where you get the error, can you cross check what th
At the place where you get the error, can you cross-check what the URL is
that is being accessed? Also, can you compare it with the URLs of the pages
before this that work?
Thanks
hemanth
On Thu, Jan 17, 2013 at 1:08 AM, jamal sasha wrote:
> I am inside a network where I need proxy settings to
Hi,
One place where I could find the capacity-scheduler.xml was from source -
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/resources.
AFAIK, the masters file is only used for starting the secondary namenode -
which has in 2.x been replaced by a pr
Hi,
AFAIK, the mapred.local.dir property refers to a set of directories under
which different types of data related to mapreduce jobs are stored - e.g.
intermediate data, localized files for a job, etc. The working
directory for a mapreduce job is configured under a subdirectory within
one of
in 2.x and trunk. Could you check if this
provides functionality you require - so we at least know there is new API
support in later versions ?
Thanks
Hemanth
On Mon, Jan 14, 2013 at 7:45 PM, Hemanth Yamijala wrote:
> Hi,
>
> No. I didn't find any reference to a working sample.
.co.uk> wrote:
> Thanks Hemanth
>
> I appreciate your response
>
> Did you find any working example of it in use? It looks to me like I’d
> still be tied to the old API
>
> Thanks
>
> Mike
>
> From: Hemanth
To add to that, log aggregation is a feature available with Hadoop 2.0
(where mapreduce is re-written to YARN). The functionality is available via
the History Server:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html
Thanks
hemanth
On Sat, Jan 12, 2013 a
Queues in the capacity scheduler are logical data structures into which
MapReduce jobs are placed to be picked up by the JobTracker / Scheduler
framework, according to some capacity constraints that can be defined for a
queue.
So, given your use case, I don't think Capacity Scheduler is going to
d
11, 2013 at 3:28 PM, Ivan Tretyakov wrote:
> Thanks for replies!
>
> keep.failed.task.files set to false.
> Config of one of the jobs attached.
>
>
> On Fri, Jan 11, 2013 at 5:44 AM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Good point. F
Good point. Forgot that one :-)
On Thu, Jan 10, 2013 at 10:53 PM, Vinod Kumar Vavilapalli <
vino...@hortonworks.com> wrote:
>
>
> Can you check the job configuration for these ~100 jobs? Do they have
> keep.failed.task.files set to true? If so, these files won't be deleted. If
> it doesn't, it c
Is this the same as:
http://stackoverflow.com/questions/6137139/how-to-save-only-non-empty-reducers-output-in-hdfs?
i.e. LazyOutputFormat, etc. ?
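A minimal sketch of the LazyOutputFormat approach (new API; not from the
original thread): only reducers that actually emit a record get an output
file, so empty part-* files are not created.

  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
  import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

  public class LazyOutputExample {
    // Call this from the driver instead of job.setOutputFormatClass(...).
    static void configureLazyOutput(Job job) {
      LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
    }
  }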
On Thu, Jan 10, 2013 at 4:51 PM, Pratyush Chandra <
chandra.praty...@gmail.com> wrote:
> Hi,
>
> I am using s3n as file system. I do not wish to crea
I just verified it with my Hadoop 1.0.2 version
Thanks
Hemanth
>
>
> On Thu, Jan 10, 2013 at 8:18 AM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Hi,
>>
>> The directory name you have provided is
>> /data?/mapred/local/taskTracker/perso
Hi,
The directory name you have provided is
/data?/mapred/local/taskTracker/persona/jobcache/.
This directory is used by the TaskTracker (slave) daemons to localize job
files when the tasks are run on the slaves.
Hence, I don't think this is related to the parameter
"mapreduce.jobtracker.retiredj
Hi,
I am not sure if your complaint is as much about the changing interfaces as
it is about documentation.
Please note that versions prior to 1.0 did not have stable interfaces as a
major requirement. Not by choice, but because the focus was on seemingly
more important functionality, stability, p
From a user perspective, at a high level, the mapreduce package can be
thought of as having user-facing client code that can be invoked, extended,
etc. as applicable from client programs.
The mapred package is to be treated as internal to the mapreduce system,
and shouldn't directly be used unless
Hi,
In Hadoop 1.0, I don't think this information is exposed. The
TaskInProgress is an internal class and hence cannot / should not be used
from client applications. The only way out seems to be to screen scrape the
information from the Jobtracker web UI.
If you can live with completed events, th
Hi,
Are tasks being executed multiple times due to failures? Sorry, it was not
very clear from your question.
Thanks
hemanth
On Sat, Jan 5, 2013 at 7:44 PM, David Parks wrote:
> Thinking here... if you submitted the task programmatically you should be
> able to capture the failure of the task
If it is a small number, A seems the best way to me.
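A minimal sketch of option A (new API; the key names here are hypothetical):
set the values on the job's Configuration in the driver and read them back
in the mapper's setup().

  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class ParamMapper extends Mapper<LongWritable, Text, Text, Text> {
    // Driver side (for context):
    //   conf.set("myapp.delimiter", "|");
    //   conf.setInt("myapp.threshold", 10);
    //   Job job = new Job(conf, "example");

    private String delimiter;
    private int threshold;

    @Override
    protected void setup(Context context) {
      // Read the values back from the job configuration in each task.
      delimiter = context.getConfiguration().get("myapp.delimiter", ",");
      threshold = context.getConfiguration().getInt("myapp.threshold", 0);
    }
  }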
On Friday, December 28, 2012, Kshiva Kps wrote:
>
> Which one is current ..
>
>
> What is the preferred way to pass a small number of configuration
> parameters to a mapper or reducer?
>
>
>
>
>
> A. As key-value pairs in the jobconf object.
Hi,
Firstly, I am talking about Hadoop 1.0. Please note that in Hadoop 2.x and
trunk, the Mapreduce framework is completely revamped to Yarn (
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
and you may need to look at different interfaces for building your own
schedu
David,
Could you please tell us what version of Hadoop you are using? I don't see
this parameter in the stable (1.x) or current branch. I only see references
to it with respect to EMR and with Hadoop 0.18 or so.
On Thu, Dec 27, 2012 at 1:51 PM, David Parks wrote:
> I didn’t come up with much in
This is a dated blog post, so it would help if someone with current HDFS
knowledge can validate it:
http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/
.
There is a bit about the RAM required for the Namenode and how to compute
it:
You can look at the 'Namespace
However, in the case Oleg is talking about, the attempts are:
attempt_201212051224_0021_m_00_0
attempt_201212051224_0021_m_02_0
attempt_201212051224_0021_m_03_0
These aren't multiple attempts of a single task, are they ? They are
actually different tasks. If they were multiple attempts,
gh to work out what I had done.
>
> Dave
>
> From: Hemanth Yamijala [mailto:yhema...@thoughtworks.com]
> Sent: Thursday, December 06, 2012 3:25 PM
>
> To: user@hadoop.apache.org
> Subject: Re: Map tasks processing some files
David,
You are using FileNameTextInputFormat. This is not in the Hadoop source, as
far as I can see. Can you please confirm where this comes from? It seems
like the isSplittable method of this input format may need checking.
Another thing: given you are adding the same input format for all f
Sampath,
You mentioned that the file is present in the tasktracker local dir, could
you please tell us the full path ? I am wondering if setting the full path
will have any impact, rather than specifying the relative path.
Another option may be to try to use the addCacheArchive and createSymLink
Generally true for the framework config files, but some of the
supplementary features can be refreshed without a restart. For example:
scheduler configuration, host files (for included / excluded nodes) ...
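For instance (the exact admin sub-commands vary by version, so treat these as
an illustration rather than a reference; run them on the respective master
nodes):

  hadoop dfsadmin -refreshNodes    # re-reads the HDFS include/exclude host files
  hadoop mradmin -refreshQueues    # reloads scheduler/queue configuration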
On Tue, Dec 4, 2012 at 5:33 AM, Cristian Cira
wrote:
> No. You will have to restart hadoop. Hot
Hi,
I have not tried this myself before, but would libhdfs help ?
http://hadoop.apache.org/docs/stable/libhdfs.html
Thanks
Hemanth
On Mon, Dec 3, 2012 at 9:52 PM, Wheeler, Bill NPO <
bill.npo.whee...@intel.com> wrote:
> I am trying to use Hadoop’s partitioning/scheduling/storage
> infrastruc
It is coming from the default properties file - mapred-default.xml. The
order of loading configuration in Hadoop is default.xml > site.xml >
job.xml.
<property>
  <name>mapred.task.tracker.report.address</name>
  <value>127.0.0.1:0</value>
  <description>The interface and port that task tracker server listens on.
  Since it is only connected to by
Hi,
I'm a little confused about where JNI comes in here (you mentioned this in
your original email). Also, where do you want to get the information for the
hadoop job? Is it in a program that is submitting a job, or some sort of
monitoring application that is monitoring jobs submitted to a cluster by
o
odAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>
>
> On Tue, Oct 16, 2012 at 3:11 AM, Hemanth Yamijala <
> yhema...@thoughtworks.com> wrote:
>
>> Hi,
>>
>> I've n
Hi,
I've not tried this on S3. However, the directory mentioned in the
exception is based on the value of this particular configuration
key: mapreduce.jobtracker.staging.root.dir. This defaults
to ${hadoop.tmp.dir}/mapred/staging. Can you please set this to an S3
location and try ?
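For example (the bucket name is hypothetical):

  <property>
    <name>mapreduce.jobtracker.staging.root.dir</name>
    <value>s3n://your-bucket/mapred/staging</value>
  </property>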
Thanks
Hemanth
Hi,
Roughly, this information will be available under the 'Hadoop map task
list' page in the Mapreduce web ui (in Hadoop-1.0, which I am assuming is
what you are using). You can reach this page by selecting the running tasks
link from the job information page. The page has a table that lists all t
Hi,
Could you please share your setup details - i.e. how many slaves, how many
datanodes and tasktrackers. Also, the configuration - in particular
hdfs-site.xml ?
To answer your question: the datanode address is picked up from
hdfs-site.xml, or hdfs-default.xml from the property dfs.datanode.addr
Hi,
Didn't check everything, but found this in the mapred-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <value>hdfs://10.99.42.9:8021/</value>
  <final>true</final>
</property>

The value shouldn't be an HDFS URL. Can you please fix this and try?
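For example, keeping the same host and port, the value would just be
host:port with no scheme:

  <property>
    <name>mapred.job.tracker</name>
    <value>10.99.42.9:8021</value>
    <final>true</final>
  </property>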
On Thu, Oct 4, 2012 at 12:32 PM, Ajit Kumar Shreevastava <
ajit.shreevast...@hcl.com> wrote:
> Hi All,