Hi Dong,
HADOOP_CONF_DIR might be pointing to the default location. You can export
HADOOP_CONF_DIR to the directory where the following configuration files are
present.
Thanks & Regards
Brahma Reddy Battula
From: Dan Dong [dongda...@gmail.com]
Sent: Saturday, December 13, 2014 3:43 AM
To: user@hadoop.apache.org
Hi,
I installed Hadoop2.6.0 on my cluster with 2 nodes, I got the following
error when I run:
$hadoop dfsadmin -report
FileSystem file:/// is not a distributed file system
What does this mean? I have already set it in core-site.xml:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master-node:9000</value>
</property>
and in hdfs-site.xml:
I solved the problem by changing the hosts file as follows:
10.10.0.10 10.5.0.10 yngcr10nc01
Thanks,
Fei
> On Nov 11, 2014, at 11:58 AM, daemeon reiydelle wrote:
>
> You may want to consider configuring host names that embed the subnet in the
> host name itself (e.g. foo50, foo40, for foo vi
Hi,
from a machine learning perspective I would recommend this approach too, if
there is no other information available that splits the data set; it depends
on the data you are processing.
And I would split the data persistently, i.e. not use the training data
directly, but write it into a file
The remaining cluster services will continue to run. That way, when the
namenode (or any other failed process) is restored, the cluster will resume
healthy operation. This is part of Hadoop's ability to handle network
partition events.
Rich Haase | Sr. Software Engineer | Pandora
m 303.887.1146 |
Try Cascading multitool: http://docs.cascading.org/multitool/2.6/
- André
On Fri, Dec 12, 2014 at 10:30 AM, unmesha sreeveni
wrote:
> I am trying to divide my HDFS file into 2 parts/files
> 80% and 20%, for a classification algorithm (80% for modelling and 20% for
> prediction).
> Please provide suggestions for the same.
How about doing something along the lines of bucketing: pick a field that is
unique for each record; if the hash of the field mod 10 is less than 8 (i.e.
0-7, which covers 80% of the hash values), the record goes in one bin,
otherwise into the other one. A rough sketch follows below.
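A minimal Java sketch of that bucketing idea. The class name, the "train" and
"test" bin labels, and the assumption that the unique field is the first
tab-separated column are all illustrative, not from the thread:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: buckets each record by the hash of a unique field.
public class BucketingMapper extends Mapper<LongWritable, Text, Text, Text> {
  private final Text bin = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Assumption: the unique field is the first tab-separated column.
    String uniqueField = value.toString().split("\t", 2)[0];
    // Mask the sign bit so the modulus is never negative.
    int bucket = (uniqueField.hashCode() & Integer.MAX_VALUE) % 10;
    // Buckets 0-7 (80% of hash values) go to "train", 8-9 (20%) to "test".
    bin.set(bucket < 8 ? "train" : "test");
    context.write(bin, value);
  }
}

A reducer (or MultipleOutputs, as in the sketch further down the thread) can
then write each bin to its own file.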
Cheers
Chris
On Dec 12, 2014 1:32 AM, "unmesha sreeveni" wrote:
> I am trying to divide my HDFS file into
Hi Unmesha
With the random approach you don't need to write the MR job for counting.
Mikael.s
-Original Message-
From: "Hitarth"
Sent: 12/12/2014 15:20
To: "user@hadoop.apache.org"
Subject: Re: Split files into 80% and 20% for building model and prediction
Hi Unmesha,
If you use the approach suggested by Mikael of taking a random 80% of the
data for training and the rest for testing, then you will have a good
distribution to generate your predictive model.
Thanks,
Hitarth
> On Dec 12, 2014, at 6:00 AM, unmesha sreeveni wrote:
>
> Hi Mikael
> So you w
HI Team,
In my project we need to get metadata information (i.e. column name,
datatype, etc.) from an RDBMS using Sqoop.
Based on the link below, I came to know that we can get metadata using the
Java API. Is there any way to get metadata information from the command line?
http://stackoverflow.com/questio
Hi Mikael
So you won't write an MR job for counting the number of records in that
file to find the 80% and 20%?
On Fri, Dec 12, 2014 at 3:54 PM, Mikael Sitruk
wrote:
>
> I would use a different approach. For each row in the mapper I would have
> invoked random.Next() then if the number generated by ra
Hi,
What happens if the namenode has crashed for more than one hour but the
secondary namenode, all the datanodes, the job tracker, and the task trackers
are running fine? Do those daemon services also automatically shut down after
some time, or do they keep running, waiting for the namenode to come back?
Regards
I would use a different approach. For each row in the mapper I would invoke
random.nextDouble(); if the generated number is below 0.8 the row goes to the
key for training, otherwise to the key for testing. A rough sketch follows
below.
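A minimal sketch of that mapper in Java, using java.util.Random's
nextDouble() and MultipleOutputs to write the two splits to separate files.
The class name and the "train"/"test" output names are illustrative
assumptions:

import java.io.IOException;
import java.util.Random;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Sends each row to "train" with probability 0.8 and to "test" otherwise,
// without counting the records first.
public class RandomSplitMapper
    extends Mapper<LongWritable, Text, NullWritable, Text> {
  private MultipleOutputs<NullWritable, Text> out;
  private final Random random = new Random();

  @Override
  protected void setup(Context context) {
    out = new MultipleOutputs<NullWritable, Text>(context);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    String target = random.nextDouble() < 0.8 ? "train" : "test";
    out.write(NullWritable.get(), value, target);
  }

  @Override
  protected void cleanup(Context context)
      throws IOException, InterruptedException {
    out.close();
  }
}

Note the split is only approximately 80/20. Running this as a map-only job
with LazyOutputFormat set in the driver avoids empty default part files.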
Mikael.s
-Original Message-
From: "Susheel Kumar Gadalay"
Sent:
Simple solution..
Copy the HDFS file to the local file system and use OS commands to count the
number of lines,
cat file1 | wc -l
and then cut it at the right line number (for example with head -n for the
first 80% of lines and tail -n for the remaining 20%).
On 12/12/14, unmesha sreeveni wrote:
> I am trying to divide my HDFS file into 2 parts/files
> 80% and 20% for classification algorithm(80% for modelling a
On Dec 12, 2014, at 03:13, Vinod Kumar Vavilapalli wrote:
> Auth to local mappings
> - nn/nn-h...@cluster.com -> hdfs
> - dn/.*@cluster.com -> hdfs
>
> The combination of the above lets you block any user other than hdfs
> from posing as a datanode.
>
> Purposes
> - _HOST: Let you d
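For reference, a sketch of how such mappings can be expressed via
hadoop.security.auth_to_local in core-site.xml. The realm CLUSTER.COM and the
exact rule are assumptions filled in around the elided principals above:

<!-- Hypothetical example: map nn/host@CLUSTER.COM and dn/host@CLUSTER.COM
     principals to the local user hdfs; anything else falls through. -->
<property>
  <name>hadoop.security.auth_to_local</name>
  <value>
    RULE:[2:$1@$0]([nd]n@CLUSTER.COM)s/.*/hdfs/
    DEFAULT
  </value>
</property>

The [2:$1@$0] part rewrites a two-component principal such as
nn/host@CLUSTER.COM into nn@CLUSTER.COM before the regex is applied.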
I am trying to divide my HDFS file into 2 parts/files,
80% and 20%, for a classification algorithm (80% for modelling and 20% for
prediction).
Please provide suggestions for the same.
To take 80% and 20% into 2 separate files we need to know the exact number of
records in the data set.
And it is only known if