I am also seeing a discrepancy when Sqoop'ing data from MySQL. Both the MySQL query "select count(*) from.." and "sqoop eval --query "select count(*).." return the same number of rows. But after importing the data into HDFS, "hadoop fs -du" shows the imported data at roughly half the size of the actual table in the MySQL DB. Is that normal?

Cheers.

-----Original Message-----
From: "Christoph Böhm" [mailto:[email protected]]
Sent: Wednesday, November 28, 2012 3:10 PM
To: [email protected]
Subject: Re: discrepancy du in dfs are fs

You're right. "du -b" returns the expected value.

Thanks.
Chris

-------- Original Message --------
> Date: Wed, 28 Nov 2012 20:17:18 +0530
> From: Mahesh Balija <[email protected]>
> To: [email protected]
> Subject: Re: discrepancy du in dfs are fs

> Hi Chris,
>
> Can you try the following on your local machine,
>
> du -b myfile.txt
>
> and compare this with the hadoop fs -du myfile.txt.
>
> Best,
> Mahesh Balija,
> Calsoft Labs.
>
> On Wed, Nov 28, 2012 at 7:43 PM, <[email protected]> wrote:
>
> >
> > Hi all,
> >
> > I wonder why there is a difference between "du" on HDFS and "get" + "du" on
> > my local machine.
> >
> > Here is an example:
> >
> > hadoop fs -du myfile.txt
> > > 81355258
> >
> > hadoop fs -get myfile.txt .
> > du myfile.txt
> > > 34919
> >
> > --- nevertheless ---
> >
> > hadoop fs -cat myfile.txt | wc -l
> > > 4789943
> >
> > cat myfile.txt | wc -l
> > > 4789943
> >
> >
> > Any idea?
> >
> > Thanks.
> > Chris
> >
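
For anyone finding this thread later, here is a minimal local sketch of why plain "du" and "du -b" can disagree for the same file. It assumes GNU coreutils (for "du -b" and "truncate") and uses a sparse temporary file purely as an illustration; that is not necessarily what happened with myfile.txt above. The variable names are made up for the example.

```shell
#!/bin/sh
# Illustration only: a sparse file has a large byte length but few allocated blocks.
# Assumes GNU coreutils (du -b, truncate); the file is a throwaway temp file.
tmpfile=$(mktemp)
truncate -s 1048576 "$tmpfile"   # 1 MiB apparent size, no data written

# Apparent size in bytes: the length of the file's contents.
apparent_bytes=$(du -b "$tmpfile" | cut -f1)

# Allocated disk usage in 1 KiB blocks: what plain "du" measures.
allocated_kib=$(du -k "$tmpfile" | cut -f1)

# Bytes actually read back from the file, for comparison.
read_bytes=$(wc -c < "$tmpfile")

echo "apparent bytes (du -b): $apparent_bytes"
echo "allocated KiB  (du -k): $allocated_kib"
echo "bytes read     (wc -c): $read_bytes"

rm -f "$tmpfile"
```

The byte count from "du -b" matches what "wc -c" reads, and that is the number to compare against "hadoop fs -du". The allocated figure depends on the local filesystem and can be smaller or larger.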
