The MySQL size is in MB. I run the following statement:
SELECT TABLE_NAME, table_rows, data_length, index_length, round(((data_length + 
index_length) / 1024 / 1024),2) 'Size in MB'....

Mahesh, good point! My Sqoop import actually uses SequenceFile as the output 
format. Wow, that is a pretty good space saving after all, sweet....
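
To compare the MySQL 'Size in MB' figure with a byte count such as the one printed by hadoop fs -du, both numbers need to be in the same unit. The following is a minimal sketch of that conversion; the three byte counts are hypothetical placeholders, not values from this thread.

```shell
# Hypothetical byte counts; substitute the values from your own system.
data_length=52428800    # bytes, like MySQL's data_length column
index_length=10485760   # bytes, like MySQL's index_length column
hdfs_bytes=31457280     # bytes, like the figure printed by hadoop fs -du

# Same arithmetic as the SQL expression: (data_length + index_length) / 1024 / 1024
mysql_mb=$(awk -v d="$data_length" -v i="$index_length" \
  'BEGIN { printf "%.2f", (d + i) / 1024 / 1024 }')

# Convert the HDFS byte count to the same unit.
hdfs_mb=$(awk -v b="$hdfs_bytes" 'BEGIN { printf "%.2f", b / 1024 / 1024 }')

echo "MySQL: ${mysql_mb} MB, HDFS: ${hdfs_mb} MB"
```

With both figures in MB, a remaining gap points at something other than units, for example compression or a different storage format.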

From: Mahesh Balija [mailto:[email protected]]
Sent: Thursday, November 29, 2012 10:31 AM
To: [email protected]
Subject: Re: discrepancy du in dfs are fs

Hi Andy,

       I am not sure, but you could check which unit (bytes, KB, MB, etc.) 
your MySQL size is reported in.
       MySQL may also be storing some additional metadata, which could 
account for the difference.

       Another possibility is that your HDFS data is compressed or stored 
as sequence files.

Best,
Mahesh Balija,
Calsoft Labs.
On Thu, Nov 29, 2012 at 8:48 PM, Kartashov, Andy 
<[email protected]<mailto:[email protected]>> wrote:
I also see a discrepancy when Sqoop'ing data from MySQL.  Both MySQL "select 
count(*)  from.." and "sqoop -eval -query "select count(*).."  return an equal 
number of rows. But after importing the data into HDFS, hadoop fs -du shows 
the imported data at roughly 1/2 the size of the actual table in the MySQL 
DB.  Is that normal?

Cheers.


-----Original Message-----
From: "Christoph Böhm" 
[mailto:[email protected]<mailto:[email protected]>]
Sent: Wednesday, November 28, 2012 3:10 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: discrepancy du in dfs are fs


You're right.
"du -b" returns the expected value.

Thanks.
Chris
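
The difference discussed here can be reproduced locally. The sketch below assumes GNU coreutils du (the -b flag is not available in every du implementation) and uses a throwaway temporary file rather than the myfile.txt from this thread. It shows that du -b reports the apparent size in bytes, matching a plain byte count, while du without -b reports allocated disk space in blocks.

```shell
# Create a small temporary file with a known byte count (12 bytes).
tmpfile=$(mktemp)
printf 'hello\nworld\n' > "$tmpfile"

# Apparent size in bytes (GNU du: -b means --apparent-size --block-size=1).
apparent=$(du -b "$tmpfile" | cut -f1)

# Allocated disk space in 1024-byte units; depends on the local file system.
allocated_kb=$(du -k "$tmpfile" | cut -f1)

# Byte count obtained by reading the file contents.
bytes=$(wc -c < "$tmpfile")

echo "apparent=$apparent allocated_kb=$allocated_kb bytes=$bytes"
rm -f "$tmpfile"
```

The apparent size is the figure to compare against a byte count from HDFS; the allocated figure can be larger or smaller depending on block size, sparse regions, or file-system compression.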

-------- Original Message --------
> Date: Wed, 28 Nov 2012 20:17:18 +0530
> From: Mahesh Balija 
> <[email protected]<mailto:[email protected]>>
> To: [email protected]<mailto:[email protected]>
> Subject: Re: discrepancy du in dfs are fs

> Hi Chris,
>
>           Can you try the following on your local machine,
>
>                du -b myfile.txt
>
>           and compare the result with that of hadoop fs -du myfile.txt.
>
> Best,
> Mahesh Balija,
> Calsoft Labs.
>
> On Wed, Nov 28, 2012 at 7:43 PM, 
> <[email protected]<mailto:[email protected]>> wrote:
>
> >
> > Hi all,
> >
> > I wonder why there is a difference between "du" on HDFS and "get" + "du"
> > on my local machine.
> >
> > Here is an example:
> >
> > hadoop fs -du myfile.txt
> > > 81355258
> >
> > hadoop fs -get myfile.txt .
> > du myfile.txt
> > > 34919
> >
> > --- nevertheless ---
> >
> > hadoop fs -cat  myfile.txt | wc -l
> > > 4789943
> >
> > cat myfile.txt | wc -l
> > > 4789943
> >
> >
> > Any idea?
> > Thanks.
> > Chris
> >
NOTICE: This e-mail message and any attachments are confidential, subject to 
copyright and may be privileged. Any unauthorized use, copying or disclosure is 
prohibited. If you are not the intended recipient, please delete and contact 
the sender immediately. Please consider the environment before printing this 
e-mail. AVIS : le présent courriel et toute pièce jointe qui l'accompagne sont 
confidentiels, protégés par le droit d'auteur et peuvent être couverts par le 
secret professionnel. Toute utilisation, copie ou divulgation non autorisée est 
interdite. Si vous n'êtes pas le destinataire prévu de ce courriel, 
supprimez-le et contactez immédiatement l'expéditeur. Veuillez penser à 
l'environnement avant d'imprimer le présent courriel

