Even when I STORE the loaded file directly, it generates only 5 output files.
The size of test.txt is 1 GB, whereas the output folder size is 178 MB.

A= LOAD 'data/test.txt' USING PigStorage();
STORE A INTO 'output';


-----Original Message----- From: kiranprasad
Sent: Thursday, September 22, 2011 10:27 AM
To: Thejas Nair
Cc: user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist


But when compared, the number of records in the output should be 12,600, but
there are only 2 records in the Linux VM output folder.

Regards
Kiran.G

-----Original Message----- From: Thejas Nair
Sent: Wednesday, September 21, 2011 10:55 PM
To: kiranprasad
Cc: user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist

This is unlikely to be a configuration issue.
This query will result in a map-only job, and the number of part files
depends on the number of map tasks spawned. In a typical configuration, in
Pig mapreduce mode, it will be based on block size. A different number of
map tasks or part files should not cause a difference in results.
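
As a rough illustration of the point above, the following Python sketch estimates how many map tasks (and hence part files) a map-only job might produce for a given input size and block size. The function name and the block sizes are assumptions chosen for illustration. Real Hadoop split calculation also applies minimum and maximum split sizes, and Pig may combine small splits, so actual counts can differ.

```python
import math


def estimate_map_tasks(file_size_bytes, block_size_bytes):
    """Rough estimate: one map task (one part-m-* file) per block-sized split."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    return math.ceil(file_size_bytes / block_size_bytes)


if __name__ == "__main__":
    one_gb = 1024 ** 3
    # Hypothetical block sizes; check the actual value in your configuration.
    for block_mb in (32, 64, 128):
        tasks = estimate_map_tasks(one_gb, block_mb * 1024 ** 2)
        print(f"{block_mb} MB blocks -> about {tasks} map tasks")
```

The estimate only concerns how the input is divided. It says nothing about which records pass the filter, so a different file count alone would not be expected to change the results.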

You might want to check for any difference in delimiters used in the
query. Having a look at the actual lines that are different might help
you figure out what is wrong.
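
One way to act on this suggestion is to count the expected matches outside Pig and compare that number with the job output. The sketch below is only an illustration under stated assumptions: it splits each line on a delimiter (PigStorage defaults to tab) and applies the pattern to the whole first field, since Pig's matches operator follows Java whole-string regular expression matching. The function name and the sample records are hypothetical.

```python
import re


def count_matching_records(lines, pattern, delimiter="\t"):
    """Count lines whose first field fully matches the regular expression."""
    regex = re.compile(pattern)
    count = 0
    for line in lines:
        # Drop the line terminator, then take the text before the first delimiter.
        first_field = line.rstrip("\r\n").split(delimiter, 1)[0]
        if regex.fullmatch(first_field):
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample records, tab-delimited.
    sample = ["9948123456\tA\n", "9848123456\tB\n", "19948\tC\n"]
    print(count_matching_records(sample, r"9948.*"))
```

Running the same count against the input file with a different delimiter argument is a quick way to see whether the delimiter choice changes which records match.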

Thanks,
Thejas



On 9/21/11 4:50 AM, kiranprasad wrote:
Hi

On a Windows system using Cygwin, the output I got was 35 files
(part-m-00001 - 00035), with the same log file xyz.txt (1 GB size) and the
same filter.

using CYGWIN (Master)
-----------
grunt> A= LOAD 'data/xyz.txt' USING PigStorage();
grunt> B= FILTER A BY ($0 matches '9948.*');
grunt> STORE B INTO 'data/output2';

using Linux VM (Master)
---------
I used the same script in this VM in local mode and mapred mode. Only 5
files (part-m-00001 - 00005) were generated as output, and the number of
records also does not match.

grunt> A= LOAD 'data/DNDDB.txt' USING PigStorage();
grunt> B= FILTER A BY ($0 matches '9948.*');
grunt> STORE B INTO 'data/output2';

I think I missed some configuration!

Regards

Kiran.G

-----Original Message----- From: kiranprasad
Sent: Wednesday, September 21, 2011 4:58 PM
To: Thejas Nair ; user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist

Now I am able to connect to HDFS and execute the Pig Latin scripts in
mapred mode, but when I compared the results from local mode and mapred
mode, they are different.
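
A comparison like this can be made concrete by diffing the two output directories record by record. The Python sketch below assumes both outputs have been copied to the local filesystem (for example with hadoop fs -get) and that they are small enough to fit in memory. The function names are hypothetical, and this is an illustration rather than a tool used in this thread.

```python
from collections import Counter
from pathlib import Path


def read_records(output_dir):
    """Collect all records from the part-* files in a Pig output directory."""
    records = Counter()
    for part in sorted(Path(output_dir).glob("part-*")):
        with part.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                records[line.rstrip("\r\n")] += 1
    return records


def compare_outputs(dir_a, dir_b):
    """Return (records only in dir_a, records only in dir_b) as Counters."""
    a, b = read_records(dir_a), read_records(dir_b)
    return a - b, b - a


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        only_a, only_b = compare_outputs(sys.argv[1], sys.argv[2])
        print(f"only in {sys.argv[1]}: {sum(only_a.values())} records")
        print(f"only in {sys.argv[2]}: {sum(only_b.values())} records")
```

Because records are counted as a multiset, the comparison is independent of how many part files each run produced and of the order of records within them.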

Regards
Kiran.G

-----Original Message----- From: Thejas Nair
Sent: Wednesday, September 21, 2011 2:23 AM
To: user@pig.apache.org
Cc: kiranprasad
Subject: Re: ERROR 2118: Input path does not exist

The put command that Marek described can do that.
http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#put

You will need to have hadoop client on that machine or move data to a
machine that has it. Copying 10GB of data over a LAN (?) should not take
too long.

-Thejas


On 9/20/11 12:22 AM, kiranprasad wrote:
How can I LOAD a 10 GB file that is on another machine?

-----Original Message----- From: Marek Miglinski
Sent: Tuesday, September 20, 2011 12:19 PM
To: user@pig.apache.org
Subject: RE: ERROR 2118: Input path does not exist

Hey,

'/data/test.txt' is supposed to be on HDFS (if you're not executing with
-x local). Put it there from your local drive with the command:
hadoop fs -put

For example, create the directory and then put the file:
hadoop fs -mkdir /data
hadoop fs -put /data/test.txt /data/


Sincerely,
Marek M.
________________________________________
From: kiranprasad [kiranprasa...@imimobile.com]
Sent: Tuesday, September 20, 2011 7:47 AM
To: user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist

Hi Marek

I got the response as below

[kiranprasad.g@pig4 bin]$ ./hadoop fs -ls /
Found 1 items
drwxr-xr-x - kiranprasad.g supergroup 0 2011-09-19 19:23 /tmp
but after loading (A= LOAD '/data/test.txt' USING PigStorage();),
I am getting the same exception.

Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://10.0.0.61/data/msisdns.txt
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://10.0.0.61/data/msisdns.txt
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:268)
    ... 7 more


Where is the default path of the data.txt configured?

Regards
Kiran.G

-----Original Message-----
From: Marek Miglinski
Sent: Monday, September 19, 2011 3:14 PM
To: user@pig.apache.org
Subject: RE: ERROR 2118: Input path does not exist

hadoop fs -ls /

SLASH (/) at the end!

And:
A= LOAD '/data/test.txt' USING PigStorage();

SLASH (/) before data!


-----Original Message-----
From: kiranprasad [mailto:kiranprasa...@imimobile.com]
Sent: Monday, September 19, 2011 12:10 PM
To: user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist

Hi

I am unable to run the command mentioned below (hadoop fs -ls); I am getting
the same output.

[kiranprasad.g@pig4 hadoop-0.20.2]$ bin/hadoop fs -ls
ls: Cannot access .: No such file or directory.

Below is the exception.
Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
Input path does not exist: hdfs://10.0.0.61/home/kiranprasad.g/pig-0.8.1/data/msisdns.txt

Regards
Kiran.G

-----Original Message-----
From: Marek Miglinski
Sent: Sunday, September 18, 2011 1:09 AM
To: user@pig.apache.org
Subject: RE: ERROR 2118: Input path does not exist

I meant that you should use an absolute path when you load an HDFS path from
Pig, so this is not correct:
A = LOAD 'data/test.txt' USING PigStorage();
This is correct:
A = LOAD '/data/test.txt' USING PigStorage();

If you want to display contents of HDFS, type from terminal:
hadoop fs -ls /
To display first level structure.
hadoop fs -lsr /
To display all levels.
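
To make the relative versus absolute distinction concrete, the Python sketch below mimics how a path is resolved on HDFS: a relative path is interpreted against the user's HDFS home directory, /user/<username>. This is consistent with the path shown in the original error (hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt). The function name is hypothetical, and the code only models the resolution rule; it does not talk to a cluster.

```python
import posixpath


def resolve_hdfs_path(path, user):
    """Resolve a path the way HDFS does for the given user.

    Absolute paths are kept; relative paths are taken relative to /user/<user>.
    """
    if path.startswith("/"):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join("/user", user, path))


if __name__ == "__main__":
    user = "kiranprasad.g"
    for candidate in ("data/test.txt", "/data/test.txt"):
        print(f"{candidate!r} -> {resolve_hdfs_path(candidate, user)}")
```

The same rule likely explains why hadoop fs -ls without an argument reports "Cannot access .": with no path given, it lists the home directory, which may not exist yet on a fresh cluster.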


Sincerely,
Marek M.
________________________________________
From: kiranprasad [kiranprasa...@imimobile.com]
Sent: Saturday, September 17, 2011 8:46 AM
To: user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist

When I run hadoop fs -ls, I get the output below:


[kiranprasad.g@pig4 ~]$ cd hadoop-0.20.2
[kiranprasad.g@pig4 hadoop-0.20.2]$ bin/hadoop fs -ls
ls: Cannot access .: No such file or directory.

Regards
Kiran.G

-----Original Message-----
From: Damien Hardy
Sent: Friday, September 16, 2011 8:34 PM
To: user@pig.apache.org
Subject: Re: ERROR 2118: Input path does not exist

What is the result of "hadoop fs -ls
hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt"

Regards,

--
Damien

On 16/09/2011 17:04, kiranprasad wrote:
Hi

I am getting the exception shown below after I load a file and apply a
filter to it.
The file (test.txt) is saved inside the Pig home/data/ folder.


grunt> A= LOAD 'data/test.txt' USING PigStorage();
grunt> B= FOREACH A GENERATE $0;
grunt> DUMP B;
2011-09-17 01:17:43,408 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2011-09-17 01:17:43,409 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-09-17 01:17:43,652 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: B: Store(hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250:org.apache.pig.impl.io.InterStorage) - scope-4 Operator Key: scope-4)
2011-09-17 01:17:43,662 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-09-17 01:17:43,688 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-09-17 01:17:43,689 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-09-17 01:17:43,742 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-09-17 01:17:43,754 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-09-17 01:17:46,447 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-09-17 01:17:46,609 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-09-17 01:17:47,525 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-09-17 01:17:48,158 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job null has failed! Stop running all dependent jobs
2011-09-17 01:17:48,162 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-09-17 01:17:48,169 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
2011-09-17 01:17:48,173 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2011-09-17 01:17:48,174 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:

HadoopVersion  PigVersion  UserId         StartedAt            FinishedAt           Features
0.20.2         0.8.1       kiranprasad.g  2011-09-17 01:17:43  2011-09-17 01:17:48  UNKNOWN

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
N/A A,B MAP_ONLY Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Input path does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:268)
    ... 7 more
hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250,

Input(s):
Failed to read data from
"hdfs://10.0.0.61/user/kiranprasad.g/data/msisdn.txt"

Output(s):
Failed to produce result in
"hdfs://10.0.0.61/tmp/temp-754030090/tmp1617007250"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
null


2011-09-17 01:17:48,174 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2011-09-17 01:17:48,184 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B
Details at logfile: /home/kiranprasad.g/pig-0.8.1/pig_1316202429844.log

Any idea where I am making the mistake?


Regards
Kiran.G










