Re: Reading from sequence file using java FS api

2012-11-13 Thread Harsh J
; org.apache.hadoop.io.LongWritable key = new org.apache.hadoop.io.LongWritable(); org.apache.hadoop.io.Text value = new org.apache.hadoop.io.Text(); try { reader = new SequenceFile.Reader(fs, path, conf); -- Harsh J

Re: Sticky Bit Problem (CDH4.1)

2012-11-09 Thread Harsh J
this and provide a more detailed stack trace or whatever if needed. There may have been some other fallout from this that I'm not aware of. I think that's it. Like I said, it was a bit of a mess for awhile but all seems well now. :) On Thu, Nov 8, 2012 at 11:26 PM, Harsh J ha...@cloudera.com

Re: Ubuntu 12.04 - Which JDK?

2012-11-08 Thread Harsh J
be used to compile hadoop mapreduce code in branch-0.23 and beyond, please use other JDKs. Is it OK to use OpenJDK 7 in Ubuntu 12.04? Thanks -- Harsh J

Re: Spill file compression

2012-11-07 Thread Harsh J
, Sigurd -- Harsh J

Re: Doubt on Input and Output Mapper - Key value pairs

2012-11-07 Thread Harsh J
as output, including zero. (b) It accepts a single key-value pair as input and emits a single key and list of corresponding values as output regards, Rams -- Harsh J

Re: Doubts on compressed file

2012-11-07 Thread Harsh J
is larger than 128 MB, will it get split into blocks and stored in HDFS? regards, Rams -- Harsh J

Re: Regarding loading Image file into HDFS

2012-11-07 Thread Harsh J
file into blocks and puts them in HDFS? Usually an image file cannot be split, right? How does this happen in Hadoop? regards, Rams -- Harsh J

Re: Sticky Bit Problem (CDH4.1)

2012-11-07 Thread Harsh J
: https://ccp.cloudera.com/display/CDH4DOC/Deploying+MapReduce+v1+%28MRv1%29+on+a+Cluster#DeployingMapReducev1%28MRv1%29onaCluster-Step7 Has anyone else encountered this? Let me know if you need more information, and thanks for your time. -- Harsh J

Re: Capacity of hdfs

2012-11-06 Thread Harsh J
use it fully ? Yes it would, unless you configure the dfs.datanode.du.reserved config param at each DN to a space value in bytes that must be left free on all configured volumes. I still need some place for local files. Thank you. Hope this helps! -- Harsh J
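A rough illustration of the arithmetic behind the reply above, for anyone skimming the archive: if dfs.datanode.du.reserved is a per-volume byte count that must be left free, then the space HDFS can use on a DataNode is roughly the sum over volumes of (capacity minus reserved). This is only a sketch under that assumption; the class and method names are made up for illustration and are not Hadoop APIs.

```java
// Hypothetical helper, not part of Hadoop: estimates how much space a
// DataNode leaves for HDFS when a fixed number of bytes is reserved
// on every configured volume.
class ReservedSpaceSketch {

    // Sum of (capacity - reserved) over all volumes, never below zero
    // for any single volume.
    static long hdfsCapacity(long[] volumeCapacities, long reservedPerVolume) {
        long total = 0;
        for (long capacity : volumeCapacities) {
            total += Math.max(0L, capacity - reservedPerVolume);
        }
        return total;
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024L * 1024L;
        long[] volumes = {500 * gib, 500 * gib}; // two assumed 500 GiB disks
        long reserved = 20 * gib;                // assumed 20 GiB kept free per disk
        System.out.println(hdfsCapacity(volumes, reserved) / gib + " GiB usable by HDFS");
    }
}
```

The point of the sketch is only that the reservation applies to each volume separately, so two disks with 20 GiB reserved give up 40 GiB in total.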

Re: combiner/reducer context in java class

2012-11-06 Thread Harsh J
applied differently for an implementation of Reducer class and an implementation of the Combiner class. This way, you repeat nothing. Thanks, Prasad -- Harsh J

Re: combiner/reducer context in java class

2012-11-06 Thread Harsh J
and reduce values. Maybe it is enough in this case? (I have never used counters inside a combiner so I don't know.) Regards Bertrand On Tue, Nov 6, 2012 at 12:29 PM, Harsh J ha...@cloudera.com wrote: Hi Prasad, My reply inline. On Tue, Nov 6, 2012 at 4:15 PM, Prasad GS gsp200...@gmail.com

Re: incompatible cluster ID - SOLUTION

2012-11-06 Thread Harsh J
prohibited. If you are not the intended recipient of this email, delete it and contact the sender immediately. Please consider the environment before printing this email -- Harsh J

Re: Task does not enter reduce function after secondary sort

2012-11-04 Thread Harsh J
and what could be possibly wrong ? Thanks Regards, Aseem -- Harsh J

Re: Task does not enter reduce function after secondary sort

2012-11-04 Thread Harsh J
? Thanks, Aseem On Mon, Nov 5, 2012 at 2:33 AM, Harsh J ha...@cloudera.com wrote: Sounds like an override issue to me. If you can share your code, we can take a quick look - otherwise, try annotating your reduce(…) method with @Override and recompiling to see if it really is the right
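To show why the @Override suggestion above helps, here is a small self-contained sketch. It assumes one common cause of this symptom: the subclass declares reduce with a slightly different parameter type, so it overloads instead of overrides, and the framework keeps calling the base-class version. The thread does not confirm that this was the actual cause. BaseReducer below is a hypothetical stand-in and not Hadoop's Reducer class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a framework base class; NOT Hadoop's Reducer.
class BaseReducer {
    final List<String> output = new ArrayList<>();

    // Default behaviour: pass every value through unchanged.
    void reduce(String key, Iterable<Integer> values) {
        for (int v : values) {
            output.add(key + "=" + v);
        }
    }

    // The "framework" side always calls the Iterable variant.
    void run(String key, List<Integer> values) {
        reduce(key, values);
    }
}

// Bug: the parameter is Iterator, not Iterable, so this method is an
// overload and is never called by run().
class BrokenSumReducer extends BaseReducer {
    // Adding @Override here would make compilation fail and expose the mistake.
    void reduce(String key, Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        output.add(key + "=" + sum);
    }
}

// Correct: same signature as the base method, checked by @Override.
class FixedSumReducer extends BaseReducer {
    @Override
    void reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        output.add(key + "=" + sum);
    }
}

class OverrideDemo {
    static List<String> runWith(BaseReducer reducer) {
        reducer.run("a", Arrays.asList(1, 2, 3));
        return reducer.output;
    }

    public static void main(String[] args) {
        System.out.println(runWith(new BrokenSumReducer())); // values passed through, not summed
        System.out.println(runWith(new FixedSumReducer()));  // values summed
    }
}
```

With the broken class the custom reduce body never runs, which matches the "task does not enter reduce function" symptom; putting @Override on it turns the silent mismatch into a compile error.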

Re: How to get the file creation date/time of a hdfs file using java hdfs API

2012-11-04 Thread Harsh J
. -- Harsh J

Re: OutputFormat and Reduce Task

2012-11-02 Thread Harsh J
, Nov 1, 2012 at 8:14 PM, Harsh J ha...@cloudera.com wrote: Hi Dhruv, Inline. On Fri, Nov 2, 2012 at 4:15 AM, Dhruv dhru...@gmail.com wrote: I'm trying to optimize the performance of my OutputFormat's implementation. I'm doing things similar to HBase's TableOutputFormat--sending

Re: Low shuffle transfer speeds

2012-11-01 Thread Harsh J
to that particular reducer? or anything else?) Any suggestions? Thanks -- Harsh J

Re: Set the number of maps

2012-11-01 Thread Harsh J
? Thanks Peter -- Harsh J

Re: SequenceFile syncFs behavior?

2012-11-01 Thread Harsh J
, Thanh Do -- Harsh J

Re: OutputFormat and Reduce Task

2012-11-01 Thread Harsh J
. The RecordWriter wrapped in it too is only instantiated once per Task. Thanks, Dhruv -- Harsh J

Re: jobtracker page @50030 timeout or take very long time.

2012-11-01 Thread Harsh J
-- Harsh J

Re: updates in hdfs

2012-10-31 Thread Harsh J
, and is supported last I checked. On Wed, Oct 31, 2012 at 9:39 PM, M. C. Srivas mcsri...@gmail.com wrote: I was under the impression that file-append was deprecated in HDFS. On Tue, Oct 30, 2012 at 10:13 PM, Harsh J ha...@cloudera.com wrote: Shiv, HDFS does have file-append support (i.e. add data at end

Re: Reading part of file using Map Reduce

2012-10-31 Thread Harsh J
option would be to copy part of data into a separate file and give that to MapReduce but I was wondering if that extra copy can be avoided. Thanks, Pankaj -- Harsh J

Re: property mapred.tasktracker.map.tasks.maximum

2012-10-31 Thread Harsh J
this email -- Thanks Regards, Anil Gupta -- Harsh J

Re: Checksum error

2012-10-31 Thread Harsh J
legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding -- Harsh J

Re: Other file systems for hadoop

2012-10-30 Thread Harsh J
, just for the hell of it - for fast unit tests, that simulated lookups and stuff. So - if the interface is abstract and decoupled enough from any real world filesystem, I think this could definitely work. -- Jay Vyas http://jayunit100.blogspot.com -- Harsh J

Re: Insight on why distcp becomes slower when adding nodemanager

2012-10-29 Thread Harsh J
6.90user 0.59system 3:29.17elapsed 3%CPU (0avgtext+0avgdata 819392maxresident)k 0inputs+344outputs (0major+62847minor)pagefaults 0swaps -- Alexandre Fouche -- Harsh J

Re: Access Hadoop Counters

2012-10-29 Thread Harsh J
, -- Nan Zhu School of Computer Science, McGill University -- Harsh J

Re: using log4j to suppress messages

2012-10-29 Thread Harsh J
... log4j.logger.org.apache.hadoop.mapreduce.LoadIncrementalHFiles=WARN but no luck. What am I doing wrong? Thanks, Jon -- Harsh J

Re: java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.(I)V

2012-10-27 Thread Harsh J
) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522) -- cente...@gmail.com|Sam -- Harsh J

Re:

2012-10-25 Thread Harsh J
-2185 http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html http://blog.csdn.net/chenpingbupt/article/details/7922042 https://issues.apache.org/jira/browse/HADOOP-8163 -- Harsh J

Re: Old vs New API

2012-10-24 Thread Harsh J
the slow progress in implementation, is it better to use the old API? Thanks. -- Alberto Cordioli -- Harsh J

Re: Data locality of map-side join

2012-10-24 Thread Harsh J
couldn't conclude the one or the other behavior from the source code and I couldn't find any documentation about this detail. Thanks for clarifying! Sigurd -- Harsh J

Re: AvatarNode configuration

2012-10-23 Thread Harsh J
different local name dir and edits dir, that is OK. Must the local name dir and edits dir be different? Thanks, LiuLei -- Harsh J

Re: File Permissions on s3 FileSystem

2012-10-23 Thread Harsh J
with hadoop on distributed mode? -- Harsh J

Re: Conflicting mkdirs() behavior in abstract test classes from Hadoop Common

2012-10-23 Thread Harsh J
} -- Harsh J

Re: Differences between YARN and Hadoop

2012-10-23 Thread Harsh J
should not be considered production-ready. UNQTE -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Friday, October 19, 2012 1:34 AM To: user@hadoop.apache.org Subject: Re: Differences between YARN and Hadoop Andy, YARN is NOT MRv2. That seems to be a major confusion

Re: Hadoop counter

2012-10-19 Thread Harsh J
not sure: if Hadoop counters are used too heavily, will there be a performance degradation for the whole job? regards, Lin -- Bertrand Dechoux -- Jay Vyas http://jayunit100.blogspot.com -- Harsh J

Re: hadoop current properties

2012-10-18 Thread Harsh J
unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient of this email, delete it and contact the sender immediately. Please consider the environment before printing this email -- Harsh J

Re: Avoid Killing of YARN container on excess virtual memory usage

2012-10-18 Thread Harsh J
? We can't always estimate the amount of virtual memory needed for our application running on a container, but we don't want it to get killed in case it exceeds the maximum limit. Please suggest how we can get around this issue. Thanks, Kishore -- Harsh J

Re: Avoid Killing of YARN container on excess virtual memory usage

2012-10-18 Thread Harsh J
I filed https://issues.apache.org/jira/browse/YARN-168. On Thu, Oct 18, 2012 at 5:07 PM, Harsh J ha...@cloudera.com wrote: This is possible to do, but you've hit a bug with the current YARN implementation. Ideally you should be able to configure the vmem-pmem ratio (or an equivalent config

Re: Hadoop on Isilon problem

2012-10-18 Thread Harsh J
No problem, thanks for closing the loop! On Thu, Oct 18, 2012 at 8:41 PM, Artem Ervits are9...@nyp.org wrote: Yup, that was it. I confused this tmp with another tmp we created before. Thank you. -Original Message- From: Harsh J [mailto:ha...@cloudera.com] Sent: Wednesday, October

Re: Differences between YARN and Hadoop

2012-10-18 Thread Harsh J
unauthorized disclosure is prohibited. If you are not the intended recipient of this email, delete it and contact the sender immediately. Please consider the environment before printing this email -- Harsh J

Re: Differences between YARN and Hadoop

2012-10-18 Thread Harsh J
? The two can't be compared this way, see above and my previous post to Andy. -- Harsh J

Re: Differences between YARN and Hadoop

2012-10-18 Thread Harsh J
further questions, but they may or may not make sense depending on the answers to the above. Thanks in advance! Tom Brown -- Harsh J

Re: Fair scheduler.

2012-10-17 Thread Harsh J
is owned by userA.userA). Vice versa, if I delete /tmp/hadoop and let the directory be created by userB, userA will not be able to submit jobs. Which is the right approach I should work with? Please suggest. Patai On Mon, Oct 15, 2012 at 3:18 PM, Harsh J ha...@cloudera.com wrote: Hi Patai

Re: Fair scheduler.

2012-10-17 Thread Harsh J
have mapred.jobtracker.staging.root.dir set to /user within HDFS. I can verify the staging files are going there but something else is still trying to access mapred.system.dir. Robin Goldstone, LLNL On 10/17/12 12:00 AM, Harsh J ha...@cloudera.com wrote: Hi, Regular users never write

Re: Fair scheduler.

2012-10-17 Thread Harsh J
+Daemons I couldn't find mapred.job.queues in that link, so I have been using mapred.queue.names, which might be my mistake. Please suggest. On Wed, Oct 17, 2012 at 8:43 AM, Harsh J ha...@cloudera.com wrote: Hey Robin, Thanks for the detailed post. Just looked at your older

Re: hadoop streaming with custom RecordReader class

2012-10-17 Thread Harsh J
to include my class to hadoop streaming at runtime? Thanks, Jason -- Harsh J

Re: Hadoop installation on mac

2012-10-16 Thread Harsh J
steps.. Thanks in advance.. Thanks, Suneel Sent from my iphone -- Harsh J

Re: mapred.reduce.tasks doesn't work

2012-10-16 Thread Harsh J
:12 PM, Yue Guan pipeha...@gmail.com wrote: Hi there, Is there any chance that set mapred.reduce.tasks=20 doesn't work in Hadoop 0.20.2? Thanks Yue -- Harsh J

Re: final the dfs.replication and fsck

2012-10-15 Thread Harsh J
it said under replication. I thought final keyword will not honor value in job config, but it doesn't seem so when i run fsck. I am on cdh3u4. please suggest. Patai -- Harsh J

Re: Suitability of HDFS for live file store

2012-10-15 Thread Harsh J
, or would I be misusing it and inviting grief? M -- Harsh J

Re: Example of secondary sort using Avro data.

2012-10-15 Thread Harsh J
Group, Are there any sample code/documentation available on writing Map-reduce jobs with secondary sort using Avro data? -- Thanks, Ravi -- Harsh J

Re: Fair scheduler.

2012-10-15 Thread Harsh J
/JobConf.html#setQueueName(java.lang.String) 6. Done. Let us know if this works! -- Harsh J

Re: final the dfs.replication and fsck

2012-10-15 Thread Harsh J
Sangbutsarakum silvianhad...@gmail.com wrote: Thanks Harsh, dfs.replication.max does do the magic!! On Mon, Oct 15, 2012 at 1:19 PM, Chris Nauroth cnaur...@hortonworks.com wrote: Thank you, Harsh. I did not know about dfs.replication.max. On Mon, Oct 15, 2012 at 12:23 PM, Harsh J ha...@cloudera.com

Re: Fair scheduler.

2012-10-13 Thread Harsh J
to control who can submit jobs to a pool? E.g. Pool1 can run jobs submitted by any user except userx. Userx can submit jobs to poolx only and can't submit to pool1. Hope this makes sense. Patai -- Harsh J

Re: speculative execution before mappers finish

2012-10-12 Thread Harsh J
: Is it possible for reducers to start (not just copying, but actually) reducing before all mappers are done, speculatively? In particular I'm asking this because I'm curious about the internals of how the shuffle and sort might (or might not :)) be able to support this. -- Harsh J

Re: namenode not in tmp, doesn't start

2012-10-12 Thread Harsh J
:1288) -- Harsh J

Re: concurrency

2012-10-12 Thread Harsh J
a new partition at the same time. Is there a risk that the query could read incomplete or corrupt files? Is there a way to use the _SUCCESS files to prevent this from happening? Thanks for your time! Best, Koert -- Harsh J

Re: Logistic regression package on Hadoop

2012-10-12 Thread Harsh J
you please suggest a logistic regression package that could be used on Hadoop? I have large data and am looking for an LR package with kernel support. Thanks Rajesh -- Harsh J

Re: distcp question

2012-10-12 Thread Harsh J
.-- -- Harsh J

Re: Getting hostname (or any environment variable) into *-site.xml files

2012-10-12 Thread Harsh J
recommendation/solution on this? thanks, stephen b -- Harsh J

Re: Issue when clicking on BrowseFileSystem

2012-10-12 Thread Harsh J
professional secrecy. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient of this email, delete it and contact the sender immediately. Please consider the environment before printing this email -- Harsh J

Re: Referencing files in job file from code

2012-10-11 Thread Harsh J
of the directory in the actual job file itself. Thanks. -- Harsh J

Re: DFS respond very slow

2012-10-10 Thread Harsh J
regards Alexey -- Harsh J

Re: DFS respond very slow

2012-10-10 Thread Harsh J
...@gmail.com wrote: Hello Harsh, I noticed such issues from the start. Yes, I mean the dfs.balance.bandwidthPerSec property, I set this property to 500. On 10/09/12 11:50 PM, Harsh J wrote: Hey Alexey, Have you noticed this right from the start itself? Also, what exactly do you mean

Re: Reading Sequence File from Hadoop Distributed Cache ..

2012-10-10 Thread Harsh J
in the distributed cache? Thank you, Mark -- Harsh J

Re: Secure hadoop and group permission on HDFS

2012-10-08 Thread Harsh J
is authenticated by the Kerberos server. But what about the groups that the user is a member of? Are these simply the groups that the user is a member of on the namenode machine? Is it viable to manage access to files on HDFS using groups on a secure Hadoop cluster? -- Harsh J

Re: Hadoopn1.03 There is insufficient memory for the Java Runtime Environment to continue.

2012-10-06 Thread Harsh J
a jar for a larger job but only running a version of wordcount that worked well under 0.2 Any bright ideas??? This is a new 1.03 installation and nothing is known to work Steven M. Lewis PhD 4221 105th Ave NE Kirkland, WA 98033 cell 206-384-1340 skype lordjoe_com -- Harsh J

Re: Job jar not removed from staging directory on job failure/how to share a job jar using distributed cache

2012-10-06 Thread Harsh J
solutions to my problem. I will look at Oozie. And worst case, I can create a FileSystem instance myself to check whether the job should be really launched or not. Both could work but both seem overkill in my context. -- Harsh J

Re: setJarByClass method semantics

2012-10-05 Thread Harsh J
-- Harsh J

Re: Counters that track the max value

2012-10-05 Thread Harsh J
to replace + with max and everything else should work? J On Wed, Oct 3, 2012 at 9:52 AM, Harsh J ha...@cloudera.com wrote: Jeremy, Here's my shot at it (pardon the quick crappy code): https://gist.github.com/3828246 Basically - you can achieve it in two ways: Requirement: All tasks must

Re: Chaning Multiple Reducers: Reduce - Reduce - Reduce

2012-10-05 Thread Harsh J
local node. I want to chain multiple reduce functions globally so the data flow looks like: Map - Reduce - Reduce - Reduce, which means each reduce operation is followed by a shuffle and sort essentially bypassing the map operation. -- Harsh J

Re: Classic(MapReduce 1) cluster in Hadoop 0.23 just won't listen

2012-10-03 Thread Harsh J
on port 9001. There are no errors in the logs, and no mention of that port, either. Obviously, all Map/Reduce examples fail with Connection Refused. Starting the same cluster using a MapReduce 2 (YARN) configuration works properly. Regards, Alexander -- Harsh J

Re: Lib conflicts

2012-10-03 Thread Harsh J
, but is not working on another VM. Replacing the 1.4 jar with the 1.7 does seem to fix the problem but this doesn't seem too sane. Hopefully there is a better alternative. Thanks! -- Harsh J

Re: Counters that track the max value

2012-10-03 Thread Harsh J
mappers or reducers? Thanks J -- Harsh J

Re: GenericOptionsParser

2012-10-03 Thread Harsh J
-- Harsh J

Re: Classic(MapReduce 1) cluster in Hadoop 0.23 just won't listen

2012-10-03 Thread Harsh J
the MR1 specific ideas I'd mentioned earlier. On Wed, Oct 3, 2012 at 12:08 PM, Harsh J ha...@cloudera.com wrote: Hi, The classic option exists to provide backward compatibility for users wanting to run an MR1 cluster (with JT, etc.). With the inclusion of YARN and MR2 modes of runtime, Apache

Re: Upgrade not finalized

2012-10-02 Thread Harsh J
previous and previous.checkpoint. It is very important that we here do not lose data. A backup is not possible for reasons of cost. Is there eventually an easy way to test it? Ulrich -- Harsh J

Re: Classic(MapReduce 1) cluster in Hadoop 0.23 just won't listen

2012-10-02 Thread Harsh J
. Regards, Alexander -- Harsh J

Re: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING

2012-10-01 Thread Harsh J
) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) -- Harsh J

Re: Reduce Copy Speed

2012-10-01 Thread Harsh J
anyone else have other benchmark numbers to share? -- Harsh J

Re: doubts reg Hive

2012-09-30 Thread Harsh J
the Hadoop user lists here. -- Harsh J

Re: Report tool ISSUE.

2012-09-30 Thread Harsh J
start.sh file in /iReport-4.7.1/bin/ it has /iReport-4.7.1/bin/ireport.exe Please suggest how to get it installed on Ubuntu, or point me to the Linux version. Thanks Regards Yogesh Kumar -- Harsh J

Re: Pseudo distributed mode : How to increase no of concurrent map task

2012-09-29 Thread Harsh J
dfs.block.size = 64MB How to increase the number of concurrent map tasks? Thanks in advance for any assistance! Shing -- Harsh J

Re: Use of CombineFileInputFormat

2012-09-28 Thread Harsh J
somebody elaborate ? -- Jay Vyas MMSB/UCHC -- Harsh J

Re: Securing cluster from access

2012-09-28 Thread Harsh J
firewall implemented yet outside cluster so that is not an option. Thanks in advance for your help -- Bertrand Dechoux Thanks and Regards , -- Harsh J

Re: Securing cluster from access

2012-09-28 Thread Harsh J
that only a set of users or set of IPs should be able to see the HDFS. We don't have a firewall implemented yet outside the cluster so that is not an option. Thanks in advance for your help -- Bertrand Dechoux Thanks and Regards , -- Harsh J

Re: MultipleOutputs side effects

2012-09-28 Thread Harsh J
-- Harsh J

Re: Usefulness of ChainMapper/ChainReducer

2012-09-28 Thread Harsh J
and ChainReducer can be implemented with just a Mapper and a Reducer containing all the code of the respective chain-implementations. Or am I missing certain aspects about why they are more than just convenience concepts? Thanks for clarifying this! Sigurd -- Harsh J

Re: Can we write output directly to HDFS from Mapper

2012-09-27 Thread Harsh J
is prohibited. If you have received this electronic message in error, please notify the sender immediately and destroy the original message and all copies. -- Harsh J

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Harsh J
-communications? how did you solve this limitation of mapreduce? thanks, jane. -- Harsh J

Re: strategies to share information between mapreduce tasks

2012-09-26 Thread Harsh J
is inherently so dynamic, and is built for rapid streaming reads/writes, which would be stifled by significant communication overhead. -- Bertrand Dechoux -- Harsh J

Re: Amateur doubt about Terasort

2012-09-26 Thread Harsh J
on the same. Thanks in advance, Nitin -- Harsh J

Re: Cannot run program autoreconf

2012-09-25 Thread Harsh J
/Projects/hadoop-1.0.3/build.xml:618: Execute failed: java.io.IOException: Cannot run program autoreconf (in directory /home/xeon/Projects/hadoop-1.0.3/src/native): java.io.IOException: error=2, No such file or directory What does this error mean? -- Best regards, -- Harsh J

Re: Python + hdfs written thrift sequence files: lots of moving parts!

2012-09-25 Thread Harsh J
/UCHC -- Harsh J

Re: libhdfs install dep

2012-09-25 Thread Harsh J
prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message. -- Harsh J

Re: Hadoop and Cuda , JCuda (CPU+GPU architecture)

2012-09-24 Thread Harsh J
Oleg. -- Harsh J

Re: How to set 2mappers on 1 job

2012-09-22 Thread Harsh J
| Software Engineer I | m: +94 719 258 242 | www.microsoft.com/enterprisesearch -- Harsh J
