Hi all,
I'm running with bundled (OSGi) versions of Hadoop 1.0.4 and HBase 0.94.12
that I built.
Most issues I encountered are related to class loaders.
One of the patterns I noticed in both projects is:
ClassLoader cl = Thread.currentThread().getContextClassLoader();
if (cl == null) {
  cl = Configuration.class.getClassLoader(); // fall back to the class's own loader
}
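In an OSGi container the context class loader is often null, or one that
can't see the Hadoop classes, so that fallback picks the wrong loader. One
workaround, as a sketch (MyActivator stands in for any class loaded by the
bundle), is to pin the context class loader around calls into Hadoop/HBase:

    ClassLoader bundleCl = MyActivator.class.getClassLoader(); // hypothetical bundle class
    ClassLoader previous = Thread.currentThread().getContextClassLoader();
    Thread.currentThread().setContextClassLoader(bundleCl);
    try {
        // calls into Hadoop/HBase here: new Configuration(), job submission, etc.
    } finally {
        Thread.currentThread().setContextClassLoader(previous); // always restore
    }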
I've been using the Fair Scheduler with Hadoop 1.0.4 for a few months now
with no issues whatsoever.
All of a sudden I have a problem where jobs are stuck in status UNASSIGNED:
submitted jobs are pending for map/reduce slots although the cluster has
free resources.
In some of the pools only map slots are a
Hi all,
I'm running a mapreduce job that has custom counters incremented in the
combiner's reduce function.
Looking at the mapreduce web UI I see that, like all counters, it has
three columns: Map, Reduce and Total.
From what I know, the combiner is executed on the map output, hence runs in
Mapp
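For reference, a minimal sketch of that kind of combiner (class and counter
names are hypothetical, not from the original message):

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    // A combiner that increments a custom counter once per combined group.
    public class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
        public enum CombinerCounters { GROUPS_COMBINED }

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.getCounter(CombinerCounters.GROUPS_COMBINED).increment(1);
            context.write(key, new IntWritable(sum));
        }
    }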
ng of the combiner is that it is like a “mapper-side
>> pre-reducer” and operates on blocks of data that have already been sorted
>> by key, so mucking with the keys doesn’t **seem** like a good idea.
>>
>> john
>>
>>
>>
>> *From:* Amit Sela [mailto:a
Hi all,
I was wondering if it is possible to manipulate the key during combine:
Say I have a mapreduce job where the key has many qualifiers.
I would like to "split" the key into two (or more) keys if it has more
than, say, 100 qualifiers.
In the combiner class I would do something like:
int coun
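To make the idea concrete, a rough sketch of such a combiner (the "#"
suffix scheme and the threshold of 100 are illustrative):

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        for (IntWritable value : values) {
            int chunk = count / 100; // 0 for the first 100 values
            // the first 100 values keep the original key, the rest get derived keys
            Text outKey = chunk == 0 ? key : new Text(key.toString() + "#" + chunk);
            context.write(outKey, value);
            count++;
        }
    }

As the quoted reply above warns, derived keys are partitioned independently
of the original key, so they may land on different reducers and won't be
re-grouped; that is the main risk of changing keys in a combiner.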
2 PM, Ted Yu wrote:
>
>> Please take a look at
>> http://hbase.apache.org/book.html#snappy.compression
>>
>> Cheers
>>
>>
>> On Wed, Jan 1, 2014 at 8:05 AM, Amit Sela wrote:
>>
>>> Hi all,
>>>
>>>
Hi all,
I'm running on Hadoop 1.0.4 and I'd like to use Snappy for map output
compression.
I'm adding the configurations:
configuration.setBoolean("mapred.compress.map.output", true);
configuration.set("mapred.map.output.compression.codec",
"org.apache.hadoop.io.compress.SnappyCodec");
And I've
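For completeness, the whole setup as a sketch; the NativeCodeLoader check
below only verifies that Hadoop's native library loaded, and libsnappy must
additionally be installed on every node:

    Configuration configuration = new Configuration();
    configuration.setBoolean("mapred.compress.map.output", true);
    configuration.set("mapred.map.output.compression.codec",
        "org.apache.hadoop.io.compress.SnappyCodec");
    // SnappyCodec is JNI-backed, so the native Hadoop library must be loadable:
    System.out.println("native hadoop loaded: "
        + org.apache.hadoop.util.NativeCodeLoader.isNativeCodeLoaded());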
I would like to add new machines to my existing cluster but they won't be
similar to the current nodes. There are two scenarios I'm thinking of:
1. What are the implications (besides initial load balancing) of adding a
new node to the cluster, if this node runs on a machine similar to all
other nodes
Hi all,
I'm using Hadoop 1.0.4 and gzip to store the logs processed by Hadoop
(logs are gzipped into block-size files).
I read that bzip2 is splittable. Is that so in Hadoop 1.0.4? Does that mean
that any input file bigger than block size will be split between maps?
What are the tradeoffs betw
Hi all,
I was wondering if there is a way to make the Fair Scheduler ignore the
user and submit a job to a specific pool (see the sketch after the list).
I would like to have three or four pools:
1. Very short (~1 min) routine jobs.
2. Normal processing time (<1 hr) routine jobs.
3. Long (days) experimental jobs.
4. ? ad hoc immediate jobs ?
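A sketch of how a job could opt into a pool explicitly, assuming the stock
branch-1 Fair Scheduler (the pool names are just the ones proposed above):

    // JobTracker side, in mapred-site.xml:
    //   mapred.fairscheduler.poolnameproperty = pool.name
    // Client side, each job sets that property to pick its pool:
    JobConf jobConf = new JobConf();
    jobConf.set("pool.name", "short-routine"); // or "normal-routine", "long-experimental"

This makes the scheduler read the pool from a jobconf property instead of
the submitting user.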
Hi everyone,
I'm running Hadoop 1.0.4 on a modest cluster (~20 machines) and I would
like to divide my cluster resources by job processing time.
The jobs running on the cluster can be divided as follows:
1. Very short jobs: less than 1 minute.
2. Normal jobs: 2-3 minutes up to an hour or two.
3. V
Hi all,
I'm running Hadoop 1.0.4 on a modest cluster (~20 machines).
The jobs running on the cluster can be divided (resource-wise) as follows:
1. Very short jobs: less than 1 minute.
2. Normal jobs: 2-3 minutes up to an hour or two.
3. Very long jobs: days of processing. (still not active and th
Sorry, Gmail tab error, please disregard and I will re-send, Thanks.
On Sat, Jul 6, 2013 at 5:02 PM, Amit Sela wrote:
> Hi all,
>
> I'm running Hadoop 1.0.4 on a modest cluster (~20 machines).
> The jobs running on the cluster can be divided (resource wise) as follows:
>
>
Hi all,
I'm trying to run ant test on a clean Hadoop branch-1 checkout.
ant works fine, but when I run ant test I get a lot of failures:
Test org.apache.hadoop.cli.TestCLI FAILED
Test org.apache.hadoop.fs.TestFileUtil FAILED
Test org.apache.hadoop.fs.TestHarFileSystem FAILED
Test org.apache.hadoop
issues.apache.org/jira/browse/HADOOP-6103, although the fix
> never made it into branch-1. Can you create a branch-1 patch for this
> please?
>
> Thanks,
> Tom
>
> On Thu, Apr 18, 2013 at 4:09 AM, Amit Sela wrote:
> > Hi all,
> >
> > I was wondering if there is
Hi all,
I was wondering if there is a good reason why the public
Configuration(Configuration other) constructor in Hadoop 1.0.4 doesn't
clone the classloader of "other" into the new Configuration.
Is this a bug?
I'm asking because I'm trying to run a Hadoop client in OSGI environment
and I need to pa
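A workaround sketch, given an existing Configuration named other:

    Configuration copy = new Configuration(other); // does not carry the loader over
    copy.setClassLoader(other.getClassLoader());   // so set it by hand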
Hi all,
I'm trying to set up a Hadoop client for job submissions (and more) as an
OSGi bundle.
I've gotten past a lot of hardships, but I'm kinda stuck now.
When I create a new Job for submission I call setClassLoader() on the Job's
Configuration so that it uses the bundle's ClassLoader (Felix), but
w
11_02job_201304150711_37 while the
webapp shows 11 submissions that were actually executed (not remotely...)
On Wed, Apr 17, 2013 at 6:40 AM, Zizon Qiu wrote:
> try using job.waitForCompletion(true) instead of job.submit().
> it should show more details.
>
>
> On Mon, Apr 15, 2013 a
Nothing in the JT log, but as I mentioned I see this in the client log:
[WARN ] org.apache.hadoop.mapred.JobClient » Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
[INFO ] org.apache.hadoop.mapred.JobClient » Cleaning up the staging
are
Reading my own message I realize that it may not be clear, so just to
clarify: the previously mentioned JT ID is indeed the correct ID.
Thanks.
On Apr 15, 2013 4:35 PM, "Amit Sela" wrote:
> This is the JT ID and there is no problem running jobs from command line,
> jus
This is the JT ID and there is no problem running jobs from command line,
just remote.
On Apr 15, 2013 4:24 PM, "Harsh J" wrote:
> That's interesting; is the JT you're running on the cluster started
> with the ID 201304150711 or something else?
>
> On Mon, Apr 15,
ing, or the cluster doesn't run anything?
>
> On Mon, Apr 15, 2013 at 3:36 PM, Amit Sela wrote:
> > Hi all,
> >
> > I'm trying to submit a mapreduce job remotely using job.submit()
> >
> > I get the following:
> >
> > [WARN ] org.apache
Hi all,
I'm trying to submit a mapreduce job remotely using job.submit()
I get the following:
[WARN ] org.apache.hadoop.mapred.JobClient » Use GenericOptionsParser
for parsing the arguments. Applications should implement Tool for the same.
[INFO ] org.apache.hadoop.mapred.JobClient »
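For context, a sketch of what such a remote submission looks like (host
names and MyJob are placeholders); switching to waitForCompletion(true), as
suggested in the reply above, prints progress and diagnostics that
submit() keeps quiet about:

    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://namenode-host:9000"); // placeholder host
    conf.set("mapred.job.tracker", "jobtracker-host:9001");   // placeholder host
    Job job = new Job(conf, "remote-job");
    job.setJarByClass(MyJob.class); // MyJob is hypothetical
    boolean succeeded = job.waitForCompletion(true); // blocks and prints progress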
10x
On Wed, Mar 13, 2013 at 1:56 PM, Azuryy Yu wrote:
> don't wait for the patch, it's a very simple fix. Just do it.
> On Mar 13, 2013 5:04 PM, "Amit Sela" wrote:
>
>> But the patch will work on 1.0.4, correct?
>>
>> On Wed, Mar 13, 2013 at 4:57 AM, George Datsk
on for this bug 1.1.2
>
>
> George
>
>
> or https://issues.apache.org/jira/browse/MAPREDUCE-4857
>
> Which is fixed in 1.0.4
>
>
> *From:* Amit Sela [mailto:am...@infolinks.com ]
> *Sent:* Tuesday, March 12, 2013 5:08 AM
> *
houldn't differ from 1.0.3 that much no ?)
Thanks!
On Tue, Mar 12, 2013 at 1:40 PM, Jean-Marc Spaggiari <
jean-m...@spaggiari.org> wrote:
> Hi Amit,
>
> Which Hadoop version are you using?
>
> I have been told it's because of
> https://issues.apache.org/jira/bro
Hi all,
I have a weird failure occurring every now and then during a MapReduce job.
This is the error:
java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Task process exit with nonzero status of 255.
        at org.ap
Hi all,
I'm implementing an API over the JobTracker client - JobClient.
My plan is to have a pool of JobClient objects that will expose the ability
to submit jobs, poll status etc.
My question is: Should I set a maximum pool size? How many connections
are too many for the JobTracker
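For illustration, a minimal bounded-pool sketch built on
java.util.concurrent (the size of 10 is arbitrary, not a recommendation):

    BlockingQueue<JobClient> pool = new ArrayBlockingQueue<JobClient>(10);
    for (int i = 0; i < 10; i++) {
        pool.add(new JobClient(new JobConf()));
    }
    JobClient client = pool.take(); // blocks while all clients are checked out
    try {
        // submit jobs / poll status with this client
    } finally {
        pool.put(client); // hand the client back
    }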
> text.write(out);
> } else {
> integer.write(out);
> }
> }
>
> [... readFields method that works in a similar way]
> }
>
> -Sandy
>
> On Sun, Feb 10, 2013 at 4:00 AM, Amit Sela wrote:
>
>> Hi all,
>>
>> Has anyo
Hi all,
Has anyone ever used some kind of a "generic output key" for a mapreduce
job?
I have a job running multiple tasks and I want them to be able to use both
Text and IntWritable as output key classes.
Any suggestions ?
Thanks,
Amit.
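One possible shape, as a sketch (not the exact solution from the thread):
since map output keys must be WritableComparable, wrap both types in a
tagged union; the tag ordering below is arbitrary:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;

    public class GenericKey implements WritableComparable<GenericKey> {
        private boolean isText;
        private final Text text = new Text();
        private final IntWritable integer = new IntWritable();

        public void set(String value) { isText = true; text.set(value); }
        public void set(int value) { isText = false; integer.set(value); }

        public void write(DataOutput out) throws IOException {
            out.writeBoolean(isText); // the tag travels with the record
            if (isText) { text.write(out); } else { integer.write(out); }
        }

        public void readFields(DataInput in) throws IOException {
            isText = in.readBoolean();
            if (isText) { text.readFields(in); } else { integer.readFields(in); }
        }

        public int compareTo(GenericKey other) {
            if (isText != other.isText) { return isText ? 1 : -1; } // group by tag first
            return isText ? text.compareTo(other.text) : integer.compareTo(other.integer);
        }
    }

For the default HashPartitioner you would also want hashCode()/equals()
consistent with compareTo().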
d not on the cluster.
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> --
> *From: * Amit Sela
> *Date: *Thu, 24 Jan 2013 18:15:49 +0200
> *To: *
> *ReplyTo: * user@hadoop.apache.org
> *Subject: *Re: Submitting Ma
t; you're looking for.
>
> On Thu, Jan 24, 2013 at 5:43 PM, Amit Sela wrote:
> > Hi all,
> >
> > I want to run a MapReduce job using the Hadoop Java api from my analytics
> > server. It is not the master or even a data node but it has the same
> Hadoop
>
Hi all,
I want to run a MapReduce job using the Hadoop Java API from my analytics
server. It is not the master or even a data node but it has the same Hadoop
installation as all the nodes in the cluster.
I tried using JobClient.runJob() but it accepts JobConf as argument and
when using JobConf it
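For reference, a sketch of pointing a JobConf at the cluster's config files
so JobClient.runJob() can reach the remote JobTracker (paths are
placeholders for whatever your installation uses):

    JobConf jobConf = new JobConf();
    jobConf.addResource(new Path("/opt/hadoop/conf/core-site.xml"));   // placeholder path
    jobConf.addResource(new Path("/opt/hadoop/conf/mapred-site.xml")); // placeholder path
    RunningJob running = JobClient.runJob(jobConf); // blocks until the job finishes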
Hi all,
I was wondering if anyone here has tried using the GPU of a Hadoop node to
enhance MapReduce processing?
I read about it, but it always comes down to heavy computations such as
matrix multiplications and Monte Carlo algorithms.
Did anyone try it with MapReduce jobs that analyze logs or any ot
E-4451, has been resolved for 1.2.0.
>
> On Tue, Nov 27, 2012 at 3:20 PM, Amit Sela wrote:
> > Hi Jon,
> >
> > I recently upgraded our cluster from Hadoop 0.20.3-append to Hadoop 1.0.4
> > and I haven't noticed any performance issues. By "
Hi Jon,
I recently upgraded our cluster from Hadoop 0.20.3-append to Hadoop 1.0.4
and I haven't noticed any performance issues. By "multiple assignment
feature" do you mean speculative execution
(mapred.map.tasks.speculative.execution
and mapred.reduce.tasks.speculative.execution) ?
On Mon, Nov
Hi everyone,
Does anyone know if the new Corona tools (which Facebook just released as
open source) are compatible with Hadoop 1.0.x, or just 0.20.x?
Thanks.
Hi all,
I want to upgrade a 1TB cluster from hadoop 0.20.3 to hadoop 1.0.3.
I am interested to know how long the HDFS upgrade takes and, in general,
how long it takes from deploying the new version until the cluster is back
to running heavy MapReduce jobs.
I'd also appreciate it if someone could elab