Hi,
HDFS is better compared with something like ext3 than with MySQL. You can use
"hadoop fs" to look at the files on HDFS just like you would look at the
"/mysql" dir on ext3. HDFS internally splits these files into chunks of 64 MB
(configurable), and each chunk ends up as a file on the underlying Linux
filesystem. You can configure the location of these chunks in a config file
called hdfs-site.xml with a property called "dfs.data.dir", which defaults to
"${hadoop.tmp.dir}/dfs/data".
I doubt there are many use cases where looking at these individual chunks is
useful, though.
If you are interested in seeing how much space something is using, use
something like this:
hadoop fs -du /user/hive/warehouse/
Just keep in mind that if you have a replication factor of 3 on your setup,
you are using roughly 3x the physical space the -du command reports.
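If you just want one total for the whole warehouse directory instead of
per-file numbers, something like this also works (these are the 0.20-era flag
names; newer releases spell the first one "-du -s"):

  hadoop fs -dus /user/hive/warehouse/      # single summarized byte count
  hadoop fs -count /user/hive/warehouse/    # dir count, file count, total bytes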
I hope that helps.
Bennie.
________________________________
From: vaibhav negi [mailto:[email protected]]
Sent: Tuesday, July 27, 2010 8:04 AM
To: [email protected]
Subject: Re: HIVE: How to Load CSV File?
Hi ,
By actual physical path, I mean the full path in the Linux directory tree. For
MySQL, for example, there is a /mysql directory; inside it I can see files for
the individual tables and also what lies inside those files.
Vaibhav Negi
2010/7/26 Alex Rovner <[email protected]>
The hadoop fs -du command will show you the size of the files. What do you
mean by physical?
Sent from my iPhone
On Jul 26, 2010, at 6:43 AM, "vaibhav negi" <[email protected]> wrote:
Hi,
The hadoop dfs command shows the logical path /user/hive/warehouse. How can I
see where this directory exists physically?
Vaibhav Negi
On Mon, Jul 26, 2010 at 2:45 PM, Amogh Vasekar <[email protected]> wrote:
Hi,
The default HWI (Hive web interface) provides some basic metadata, but I don't
think file sizes are included. In any case, you can query using the common
hadoop dfs commands. The default warehouse directory is whatever is set in
your Hive conf XML.
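The relevant property is hive.metastore.warehouse.dir; a typical hive-site.xml
entry looks roughly like this (the value shown is the usual default):

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- HDFS directory under which managed Hive tables are stored -->
    <value>/user/hive/warehouse</value>
  </property>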
Amogh
On 7/26/10 2:30 PM, "vaibhav negi" <[email protected]> wrote:
Hi,
Thanks amogh.
How can I browse the actual physical location of Hive tables, just like I can
see MySQL tables in the mysql directory? I want to check the actual disk space
consumed by Hive tables.
Vaibhav Negi
On Mon, Jul 26, 2010 at 1:55 PM, Amogh Vasekar <[email protected]> wrote:
Hi,
You can create an external table pointing to data already on HDFS, specifying
the delimiters:
CREATE EXTERNAL TABLE page_view_stg(viewTime INT, userid BIGINT,
page_url STRING, referrer_url STRING,
ip STRING COMMENT 'IP Address of the User',
country STRING COMMENT 'country of origination')
COMMENT 'This is the staging page view table'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '44' LINES TERMINATED BY '12'
STORED AS TEXTFILE
LOCATION '/user/data/staging/page_view';
<http://wiki.apache.org/hadoop/Hive/Tutorial#Creating_Tables>http://wiki.apache.org/hadoop/Hive/Tutorial#Creating_Tables
for more
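If the CSV is sitting on the local filesystem rather than already on HDFS,
another common option is to load it into the table directly; the file path
here is just an example:

  LOAD DATA LOCAL INPATH '/tmp/page_views.csv'
  OVERWRITE INTO TABLE page_view_stg;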
HTH,
Amogh
On 7/26/10 1:02 PM, "vaibhav negi" <[email protected]> wrote:
Hi,
Is there some way to load a CSV file into Hive?
Vaibhav Negi