org.apache.hadoop.io.LongWritable key =
    new org.apache.hadoop.io.LongWritable();
org.apache.hadoop.io.Text value =
    new org.apache.hadoop.io.Text();
try {
  reader = new SequenceFile.Reader(fs, path, conf);
  while (reader.next(key, value)) {
    // process each (key, value) record here
  }
} finally {
  org.apache.hadoop.io.IOUtils.closeStream(reader);
}
--
Harsh J
this and provide a
more detailed stack trace or whatever if needed. There may have been some
other fallout from this that I'm not aware of.
I think that's it. Like I said, it was a bit of a mess for a while but all
seems well now. :)
On Thu, Nov 8, 2012 at 11:26 PM, Harsh J ha...@cloudera.com
be used to compile hadoop mapreduce code in branch-0.23 and
beyond, please use other JDKs.
Is it OK to use OpenJDK 7 in Ubuntu 12.04?
Thanks
--
Harsh J
,
Sigurd
--
Harsh J
as output, including zero.
(b) It accepts a single key-value pair as input and emits a single key and
list of corresponding values as output
regards,
Rams
--
Harsh J
is larger
than 128 MB, will it get split into blocks and stored in HDFS?
regards,
Rams
--
Harsh J
file into blocks and puts
them in HDFS? Usually an image file cannot be split, right? How does it happen in
Hadoop?
regards,
Rams
--
Harsh J
:
https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v1+%28MRv1%29+on+a+Cluster#DeployingMapReducev1%28MRv1%29onaCluster-Step7
Has anyone else encountered this? Let me know if you need more information,
and thanks for your time.
--
Harsh J
use it fully?
Yes it would, unless you configure the dfs.datanode.du.reserved
config param at each DN to a space value in bytes that must be left
free on all configured volumes.
I still need some place for local files.
Thank you.
Hope this helps!
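As a sketch, the reservation described above goes into hdfs-site.xml on each DataNode; the 10 GB value below is only an example:

```xml
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- Bytes to keep free per volume for non-HDFS use (here: 10 GB) -->
  <value>10737418240</value>
</property>
```

Note the setting applies per configured volume, so the total space held back scales with the number of data directories.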
--
Harsh J
applied differently for an implementation of Reducer class and an
implementation of the Combiner class. This way, you repeat nothing.
Thanks,
Prasad
--
Harsh J
and reduce values. Maybe it is enough in this case? (I have
never used counters inside a combiner so I don't know.)
Regards
Bertrand
On Tue, Nov 6, 2012 at 12:29 PM, Harsh J ha...@cloudera.com wrote:
Hi Prasad,
My reply inline.
On Tue, Nov 6, 2012 at 4:15 PM, Prasad GS gsp200...@gmail.com
prohibited. If you are not the intended
recipient of this email, delete it and contact the sender immediately.
Please consider the environment before printing this email
--
Harsh J
and what could be possibly wrong ?
Thanks Regards,
Aseem
--
Harsh J
?
Thanks,
Aseem
On Mon, Nov 5, 2012 at 2:33 AM, Harsh J ha...@cloudera.com wrote:
Sounds like an override issue to me. If you can share your code, we
can take a quick look - otherwise, try annotating your reduce(…)
method with @Override and recompiling to see if it really is the right
.
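As an illustrative sketch of the suggestion (class name and key/value types here are hypothetical, not from the original thread): with @Override, a mismatched reduce() signature becomes a compile-time error instead of a silently unused method.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override  // compilation fails here if this does not match Reducer.reduce()
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();  // accumulate all values for this key
    }
    context.write(key, new IntWritable(sum));
  }
}
```

Without the annotation, a wrong parameter type simply overloads rather than overrides, and the framework falls back to the identity reduce.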
--
Harsh J
, Nov 1, 2012 at 8:14 PM, Harsh J ha...@cloudera.com wrote:
Hi Dhruv,
Inline.
On Fri, Nov 2, 2012 at 4:15 AM, Dhruv dhru...@gmail.com wrote:
I'm trying to optimize the performance of my OutputFormat's
implementation.
I'm doing things similar to HBase's TableOutputFormat--sending
to that particular reducer? or anything else?)
Any suggestions?
Thanks
--
Harsh J
?
Thanks
Peter
--
Harsh J
,
Thanh Do
--
Harsh J
. The RecordWriter wrapped in it too is only instantiated
once per Task.
Thanks,
Dhruv
--
Harsh J
--
Harsh J
,
and is supported last I checked.
On Wed, Oct 31, 2012 at 9:39 PM, M. C. Srivas mcsri...@gmail.com wrote:
I was under the impression that file-append was deprecated in HDFS.
On Tue, Oct 30, 2012 at 10:13 PM, Harsh J ha...@cloudera.com wrote:
Shiv,
HDFS does have file-append support (i.e. add data at end
option would be to copy part of data into a separate file
and give that to MapReduce but I was wondering if that extra copy can be
avoided.
Thanks,
Pankaj
--
Harsh J
--
Thanks Regards,
Anil Gupta
--
Harsh J
legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding
--
Harsh J
, just for the hell of it - for fast unit tests, that simulated
lookups and stuff.
So - if the interface is abstract and decoupled enough from any real world
filesystem, I think this could definitely work.
--
Jay Vyas
http://jayunit100.blogspot.com
--
Harsh J
6.90user 0.59system 3:29.17elapsed 3%CPU (0avgtext+0avgdata
819392maxresident)k
0inputs+344outputs (0major+62847minor)pagefaults 0swaps
--
Alexandre Fouche
--
Harsh J
,
--
Nan Zhu
School of Computer Science,
McGill University
--
Harsh J
...
log4j.logger.org.apache.hadoop.mapreduce.LoadIncrementalHFiles=WARN
but no luck.
What am I doing wrong?
Thanks,
Jon
--
Harsh J
)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
at
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
--
cente...@gmail.com|Sam
--
Harsh J
-2185
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html
http://blog.csdn.net/chenpingbupt/article/details/7922042
https://issues.apache.org/jira/browse/HADOOP-8163
--
Harsh J
the slow progress in implementation, is it better to use the old API?
Thanks.
--
Alberto Cordioli
--
Harsh J
couldn't
conclude one behavior or the other from the source code, and I
couldn't find any documentation about this detail.
Thanks for clarifying!
Sigurd
--
Harsh J
different local name dir and edits dir, that is ok. Must the
local name dir and edits dir be different?
Thanks,
LiuLei
--
Harsh J
with
Hadoop in distributed mode?
--
Harsh J
}
--
Harsh J
should not be considered production-ready.
UNQTE
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Friday, October 19, 2012 1:34 AM
To: user@hadoop.apache.org
Subject: Re: Differences between YARN and Hadoop
Andy,
YARN is NOT MRv2. That seems to be a major confusion
not sure if using Hadoop counters is too heavy; will there be a performance
downgrade for the whole job?
regards,
Lin
--
Bertrand Dechoux
--
Jay Vyas
http://jayunit100.blogspot.com
--
Harsh J
? We can't always
estimate the amount of virtual memory needed for our application running on
a container, but we don't want to get it killed in a case it exceeds the
maximum limit.
Please suggest how we can work around this issue.
Thanks,
Kishore
--
Harsh J
I filed https://issues.apache.org/jira/browse/YARN-168.
On Thu, Oct 18, 2012 at 5:07 PM, Harsh J ha...@cloudera.com wrote:
This is possible to do, but you've hit a bug with the current YARN
implementation. Ideally you should be able to configure the vmem-pmem
ratio (or an equivalent config
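For context, the ratio being discussed is a NodeManager setting in yarn-site.xml; the value below is only an example (the default is 2.1):

```xml
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <!-- Virtual memory allowed per unit of physical memory for containers -->
  <value>4</value>
</property>
```

Raising the ratio gives containers more virtual-memory headroom before the NodeManager kills them.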
No problem, thanks for closing the loop!
On Thu, Oct 18, 2012 at 8:41 PM, Artem Ervits are9...@nyp.org wrote:
Yup, that was it. I confused this tmp with another tmp we created before.
Thank you.
-Original Message-
From: Harsh J [mailto:ha...@cloudera.com]
Sent: Wednesday, October
--
Harsh J
?
The two can't be compared this way, see above and my previous post to Andy.
--
Harsh J
further questions, but they may or may not make sense depending
on the answers to the above.
Thanks in advance!
Tom Brown
--
Harsh J
is owned by userA.userA). Vice versa, if I delete
/tmp/hadoop and let the directory be created by userB, userA will not
be able to submit jobs.
Which is the right approach I should work with?
Please suggest
Patai
On Mon, Oct 15, 2012 at 3:18 PM, Harsh J ha...@cloudera.com wrote:
Hi Patai
have mapred.jobtracker.staging.root.dir set to
/user within HDFS. I can verify the staging files are going there but
something else is still trying to access mapred.system.dir.
Robin Goldstone, LLNL
On 10/17/12 12:00 AM, Harsh J ha...@cloudera.com wrote:
Hi,
Regular users never write
+Daemons
I couldn't find mapred.job.queues from that link so I have been
using mapred.queue.names, so it may be my fault.
Please suggest
On Wed, Oct 17, 2012 at 8:43 AM, Harsh J ha...@cloudera.com wrote:
Hey Robin,
Thanks for the detailed post.
Just looked at your older
to include my class in hadoop streaming at runtime?
Thanks,
Jason
--
Harsh J
steps..
Thanks in advance..
Thanks,
Suneel
Sent from my iphone
--
Harsh J
:12 PM, Yue Guan pipeha...@gmail.com wrote:
Hi, there
Is there any chance set mapred.reducel.tasks=20 doesn't work in
hadoop 0.20.2?
Thanks
Yue
--
Harsh J
it
said under-replicated.
I thought the final keyword would make it ignore the value in the job
config, but it doesn't seem so when I run fsck.
I am on cdh3u4.
Please suggest.
Patai
--
Harsh J
, or would I be misusing it
and inviting grief?
M
--
Harsh J
Group,
Are there any sample code/documentation available on writing Map-reduce
jobs with secondary sort using Avro data?
--
Thanks,
Ravi
--
Harsh J
/JobConf.html#setQueueName(java.lang.String)
6. Done.
Let us know if this works!
--
Harsh J
Sangbutsarakum
silvianhad...@gmail.com wrote:
Thanks Harsh, dfs.replication.max does do the magic!!
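For reference, the setting that "does the magic" caps the replication factor any file may request, and lives in hdfs-site.xml (the value below is an example):

```xml
<property>
  <name>dfs.replication.max</name>
  <!-- Maximal block replication the NameNode will accept for any file -->
  <value>3</value>
</property>
```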
On Mon, Oct 15, 2012 at 1:19 PM, Chris Nauroth cnaur...@hortonworks.com
wrote:
Thank you, Harsh. I did not know about dfs.replication.max.
On Mon, Oct 15, 2012 at 12:23 PM, Harsh J ha...@cloudera.com
to control who can submit jobs to a pool?
E.g. Pool1 can run jobs submitted from any user except userx.
Userx can submit jobs to poolx only; it can't submit to pool1.
Hope this makes sense.
Patai
--
Harsh J
:
Is it possible for reducers to start (not just copying, but actually)
reducing before all mappers are done, speculatively?
In particular I'm asking this because I'm curious about the internals of how
the shuffle and sort might (or might not :)) be able to support this.
--
Harsh J
:1288)
--
Harsh J
a new partition at the same time. Is there a risk that the query
could read incomplete or corrupt files? Is there a way to use the _SUCCESS
files to prevent this from happening?
Thanks for your time!
Best,
Koert
--
Harsh J
you please suggest Logistic regression package that could be used on
Hadoop ?
I have large data and looking for LR package with kernel supports.
Thanks
Rajesh
--
Harsh J
--
Harsh J
recommendation/solution on this?
thanks,
stephen b
--
Harsh J
of the directory in the actual job file itself.
Thanks.
--
Harsh J
regards
Alexey
--
Harsh J
...@gmail.com wrote:
Hello Harsh,
I noticed such issues from the start.
Yes, I mean dfs.balance.bandwidthPerSec property, I set this property to
500.
On 10/09/12 11:50 PM, Harsh J wrote:
Hey Alexey,
Have you noticed this right from the start itself? Also, what exactly
do you mean
in the distribued cache??
Thank you,
Mark
--
Harsh J
is authenticated by the kerberos server.
But what about the groups that the user is a member of? Are these simply the
groups that the user is a member of on the namenode machine?
Is it viable to manage access to files on HDFS using groups on a secure
hadoop cluster?
--
Harsh J
a jar for a larger job but only running a version of
wordcount that worked well under 0.2
Any bright ideas???
This is a new 1.03 installation and nothing is known to work
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
cell 206-384-1340
skype lordjoe_com
--
Harsh J
solutions to my problem. I will look at Oozie.
And worst case, I can create a FileSystem instance myself to check whether
the job should be really launched or not. Both could work but both seem
overkill in my context.
--
Harsh J
--
Harsh J
to replace + with max and everything else should work?
J
On Wed, Oct 3, 2012 at 9:52 AM, Harsh J ha...@cloudera.com wrote:
Jeremy,
Here's my shot at it (pardon the quick crappy code):
https://gist.github.com/3828246
Basically - you can achieve it in two ways:
Requirement: All tasks must
local node. I want to chain
multiple reduce functions globally so the data flow looks like: Map -
Reduce - Reduce - Reduce, which means each reduce operation is
followed by a shuffle and sort essentially bypassing the map
operation.
--
Harsh J
on port 9001. There are no errors in the logs, and no mention of that
port,
either.
Obviously, all Map/Reduce examples fail with Connection Refused.
Starting the same cluster using a MapReduce 2 (YARN) configuration works
properly.
Regards,
Alexander
--
Harsh J
, but is not working on another VM. Replacing the
1.4 jar with the 1.7 does seem to fix the problem but this doesn't seem too
sane. Hopefully there is a better alternative.
Thanks!
--
Harsh J
mappers or reducers?
Thanks
J
--
Harsh J
--
Harsh J
the MR1 specific ideas I'd mentioned earlier.
On Wed, Oct 3, 2012 at 12:08 PM, Harsh J ha...@cloudera.com wrote:
Hi,
The classic option exists to provide backward compatibility for users
wanting to run an MR1 cluster (with JT, etc.).
With the inclusion of YARN and MR2 modes of runtime, Apache
previous and
previous.checkpoint. It is very important that we do not lose data here. A
backup is not possible for reasons of cost. Is there possibly an easy way
to test it?
Ulrich
--
Harsh J
.
Regards,
Alexander
--
Harsh J
)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
--
Harsh J
anyone else have other benchmark numbers to share?
--
Harsh J
the Hadoop user lists here.
--
Harsh J
start.sh file in /iReport-4.7.1/bin/
it has /iReport-4.7.1/bin/ireport.exe
Please suggest how to install it on Ubuntu.
Please suggest for the Linux version.
Thanks Regards
Yogesh Kumar
--
Harsh J
dfs.block.size = 64MB
How to increase the number of concurrent map tasks?
Thanks in advance for any assistance !
Shing
--
Harsh J
somebody
elaborate?
--
Jay Vyas
MMSB/UCHC
--
Harsh J
that only set
of users or set of IPs should be able to see the HDFS.
We dont have firewall implemented yet outside cluster so that is not an
option.
Thanks in advance for your help
--
Bertrand Dechoux
Thanks and Regards ,
--
Harsh J
--
Harsh J
and
ChainReducer can be implemented with just a Mapper and a Reducer containing
all the code of the respective chain-implementations. Or am I missing
certain aspects about why they are more than just convenience concepts?
Thanks for clarifying this!
Sigurd
--
Harsh J
is prohibited. If you
have received this electronic message in error, please notify the sender
immediately and destroy the original message and all copies.
--
Harsh J
-communications?
how did you solve this limitation of mapreduce?
thanks,
jane.
--
Harsh J
is
inherently so dynamic, and is built for rapid streaming reads/writes,
which
would be stifled by significant communication overhead.
--
Bertrand Dechoux
--
Harsh J
on
the same.
Thanks in advance,
Nitin
--
Harsh J
/Projects/hadoop-1.0.3/build.xml:618: Execute failed:
java.io.IOException: Cannot run program autoreconf (in directory
/home/xeon/Projects/hadoop-1.0.3/src/native): java.io.IOException:
error=2, No such file or directory
What does this error mean?
--
Best regards,
--
Harsh J
/UCHC
--
Harsh J
prohibited. If you have received this
communication in error, please notify us immediately by e-mail, and
delete the original message.
--
Harsh J
Oleg.
--
Harsh J
| Software Engineer I | m: +94 719 258 242 |
www.microsoft.com/enterprisesearch
--
Harsh J