Cos,
I understand that there are rules. What are these rules? Is it Hive vs. Hadoop
(this I understand) or Apache Hadoop vs. a specific distribution? (This I am
not clear about.)
Sent from my iPad
Please excuse the typos.
On Feb 12, 2013, at 8:56 PM, Konstantin Boudnik wrote:
> With all d
Hi Harsh,
Thanks a lot for your reply and great suggestions.
In practical cases, the values usually do not reside in the same
data node. Instead, they are mostly distributed by the key range
itself. So it does require 20 GB of memory, but distributed across
different nodes.
The MapFile solution is
Hi Harsh,
Thanks for moving the post to the correct list.
William
On Wed, Feb 13, 2013 at 12:29 AM, Harsh J wrote:
> Please do not use the general@ lists for any user-oriented questions.
> Please redirect them to user@hadoop.apache.org lists, which is where
> the user community and questions li
My reply to your questions is inline.
On Wed, Feb 13, 2013 at 10:59 AM, Harsh J wrote:
> Please do not use the general@ lists for any user-oriented questions.
> Please redirect them to user@hadoop.apache.org lists, which is where
> the user community and questions lie.
>
> I've moved your post th
Please do not use the general@ lists for any user-oriented questions.
Please redirect them to user@hadoop.apache.org lists, which is where
the user community and questions lie.
I've moved your post there and have added you on CC in case you
haven't subscribed there. Please reply back only to the u
With all due respect, sir, these mailing lists have certain rules, which
evidently do not coincide with your philosophy.
Cos
On Tue, Feb 12, 2013 at 08:45PM, Raj Vishwanathan wrote:
> Arun
>
> I don't understand your reply! Had you redirected this person to the hive
> mailing list I would have
Arun
I don't understand your reply! Had you redirected this person to the Hive
mailing list, I would have understood.
My philosophy on any mailing list has always been: if I know the answer to a
question, I reply; else I humbly walk away.
I got a lot of help from this group for my (mostly
Please don't cross-post; this belongs only to the CDH lists.
On Feb 12, 2013, at 12:55 AM, samir das mohapatra wrote:
>
>
> Hi All,
> I wanted to know how to connect Hive (hadoop-cdh4 distribution) with
> MicroStrategy.
> Any help is very helpful.
>
> Waiting for your response.
>
> Note: It is li
Can you please include the complete stack trace and not just the root cause?
Also, have you set fs.default.name to an HDFS location like
hdfs://localhost:9000?
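For instance, a quick way to check from Java (the host and port below are placeholders, not taken from your setup):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://localhost:9000");  // placeholder host/port
    // Printing the resolved URI confirms the client points at HDFS rather than
    // the default local file system (file:///).
    System.out.println(FileSystem.get(conf).getUri());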
Thanks
Hemanth
On Wednesday, February 13, 2013, Alex Thieme wrote:
> Thanks for the prompt reply and I'm sorry I forgot to include the
> excep
Hello
Can someone shed some light on what the Hadoop source code of the class
org.apache.hadoop.io.compress.BlockDecompressorStream, method
rawReadInt(), is trying to do here?
The BlockDecompressorStream class is used for block-based decompression
(e.g. Snappy). Each chunk has a header indicating h
--
http://hortonworks.com/download/
Thanks for the prompt reply and I'm sorry I forgot to include the exception. My
bad. I've included it below. There certainly appears to be a server running on
localhost:9001. At least, I was able to telnet to that address. While in
development, I'm treating the server on localhost as the remote
conf.set("mapred.job.tracker", "localhost:9001");
This means that your JobTracker is on port 9001 on localhost.
If you change it to the remote host, and that's the port it's running on, then
it should work as expected.
What's the exception you are getting?
On Wed, Feb 13, 2013 at 2:41 AM, Alex Thieme
I apologize for asking what seems to be such a basic question, but I could use
some help with submitting a job to a remote server.
I have downloaded and installed hadoop locally in pseudo-distributed mode. I
have written some Java code to submit a job.
Here's the org.apache.hadoop.util.Tool an
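For reference, a minimal sketch of such a Tool might look like the following (this is not the poster's actual code; the class name, host names, ports and paths are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class RemoteJobSubmitter extends Configured implements Tool {
      @Override
      public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        // Point the client at the remote cluster instead of localhost
        // (placeholder host names and ports).
        conf.set("fs.default.name", "hdfs://remotehost:9000");
        conf.set("mapred.job.tracker", "remotehost:9001");

        Job job = new Job(conf, "remote-submit-example");
        job.setJarByClass(RemoteJobSubmitter.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new RemoteJobSubmitter(), args));
      }
    }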
Can someone shed some light on what the Hadoop source code of the class
org.apache.hadoop.io.compress.BlockDecompressorStream, method rawReadInt(), is
trying to do here?
There is a comment in the code that this method shouldn't return a negative
number, but in my testing file, it contains the following b
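For what it's worth, the usual shape of such a method (a sketch following the java.io.DataInputStream.readInt() pattern, not a verbatim copy of the Hadoop source) is to read four bytes and assemble them into a big-endian int, which here would be the length header of the next compressed chunk:

    import java.io.EOFException;
    import java.io.IOException;
    import java.io.InputStream;

    // Sketch: read a 4-byte big-endian int, e.g. a chunk-length header.
    static int rawReadInt(InputStream in) throws IOException {
      int b1 = in.read();
      int b2 = in.read();
      int b3 = in.read();
      int b4 = in.read();
      if ((b1 | b2 | b3 | b4) < 0) {
        throw new EOFException();  // the stream ended in the middle of the header
      }
      return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
    }

A negative return value would mean the first byte had its high bit set, which for a length header would normally indicate corrupt or misaligned input.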
No, Yong, I believe you misunderstood. David's explanation makes sense. As
pointed out in my original email, everything is going to 1 Mapper. It's
not creating multiple mappers.
BTW, the code given in my original email indeed works as expected. It
does trigger multiple mappers, but it doesn't
On Tue, Feb 12, 2013 at 11:43 PM, Robert Molina wrote:
> to do it, there should be some information he
This is the best way to remove a data node from a cluster. You have done the
right thing.
∞
Shashwat Shriparv
The decommissioning process is controlled by an exclude file, which for
HDFS is set by the dfs.hosts.exclude property, and for MapReduce by
the mapred.hosts.exclude property. In most cases, there is one shared file,
referred to as the exclude file. This exclude file name should be specified a
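For example, on the NameNode the property typically points at the exclude file in hdfs-site.xml (the path below is just a placeholder):

    <property>
      <name>dfs.hosts.exclude</name>
      <value>/etc/hadoop/conf/excludes</value>
    </property>

After adding the host names of the nodes to decommission to that file, running hadoop dfsadmin -refreshNodes tells the NameNode to re-read it and start decommissioning.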
Hi,
I would like to add another scenario: what are the steps for removing a
dead node when the server has had a hard failure that is unrecoverable?
Thanks,
Ben
On Tuesday, February 12, 2013 7:30:57 AM UTC-8, sudhakara st wrote:
>
> The decommissioning process is controlled by an exclude file, which
Hi Dhanasekaran,
I believe you are trying to ask whether it is recommended to use the
decommissioning feature to remove datanodes from your cluster; the answer
would be yes. As far as how to do it, there should be some information
here http://wiki.apache.org/hadoop/FAQ that should help.
Regards,
Rober
With the help of Costin, I got a working Maven configuration.
Thank you :).
This is a pom.xml for Spring Data Hadoop and CDH4:
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.ap
Hi, Davie:
I am not sure I understand this suggestion. Why would a smaller block size help
with this performance issue?
From what the original question is about, it looks like the performance problem
is due to the fact that there are a lot of small files, and each file will run
in its own mapper.
As hadoop nee
I don't think you can get a list of all input files in the mapper, but what you
can get is the current file's information.
From the context object reference, you can get the InputSplit, which should
give you all the information you want about the current input file.
http://hadoop.apache.org/docs/r2.0
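For example, something along these lines inside the mapper (a sketch assuming the job uses FileInputFormat, so the split can be cast to FileSplit):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Inside Mapper.setup() or Mapper.map(), where "context" is the Mapper.Context:
    FileSplit split = (FileSplit) context.getInputSplit();
    Path currentFile = split.getPath();  // the file this mapper is reading
    long start = split.getStart();       // byte offset of this split within the file
    long length = split.getLength();     // length of this split in bytes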
Hi,
Could you first try running the example:
$ /usr/bin/hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar
grep input output 'dfs[a-z.]+'
Do you receive the same error?
Not sure if it's related to a lack of RAM, but as the stack trace shows
errors with "network" timeout (I r
Hi,
For Spring Data Hadoop problems, it's best to use the designated forum
[1]. That being said, I've tried to reproduce your error but I can't;
I've upgraded the build to CDH 4.1.3, which runs fine against the VM on
the CI (4.1.1).
Maybe you have some other libraries on the client classpath?
Hi all,
I installed a redhat_enterprise-linux-x86 VM in VMware Workstation and gave the
virtual machine 1 GB of memory.
Then I followed the steps in "Installing CDH4 on a Single Linux Node in
Pseudo-distributed Mode":
https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Nod
Hi,
I am trying to use Spring Data Hadoop with CDH4 to write a MapReduce job.
On startup, I get the following exception:
Exception in thread "SimpleAsyncTaskExecutor-1"
java.lang.ExceptionInInitializerError
at
org.springframework.data.hadoop.mapreduce.JobExecutor$2.run(JobExecutor.java:183)
Hi Vikas,
You can get the FileSystem instance by calling
FileSystem.get(Configuration);
Once you get the FileSystem instance, you can use
FileSystem.listStatus(inputPath) to get the FileStatus instances.
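A short sketch of that (the input path below is just a placeholder):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FileStatus[] statuses = fs.listStatus(new Path("/user/input"));  // placeholder path
    for (FileStatus status : statuses) {
      System.out.println(status.getPath() + " : " + status.getLen() + " bytes");
    }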
Best,
Mahesh Balija,
Calsoft Labs.
On Tue, Feb 12, 2013 at
Hi All,
I wanted to know how to connect Hive (hadoop-cdh4 distribution) with
MicroStrategy.
Any help is very helpful.
Waiting for your response.
Note: It is a little bit urgent; does anyone have experience with that?
Thanks,
samir