Error : PigUnit in Windows->Eclipse (without Hadoop/Pig in Windows)

Krishnan K Sat, 12 Jul 2014 21:15:09 -0700

Hi,

I'm running a Pig script on my Windows machine, which has no Hadoop/Pig
environment installed.


Some questions:
1. Can I run PigUnit test cases on Windows without any Hadoop/Pig
environment set up?
2. Can I run PigUnit test cases in local mode through Eclipse if I
configure the cluster details? If yes, where do I provide the cluster
details?
3. Can I run PigUnit test cases in mapreduce mode through Eclipse if I
configure the cluster details? If yes, where do I provide the cluster
details?
4. Can I build the Maven jar on my Windows machine without running the test
cases, and then deploy it to a cluster that has Hadoop/Pig?

Appreciate your help.

I executed a PigUnit test case and it failed. The log below has the error
details:

14/07/12 17:55:30 INFO pigunit.PigTest: Using default local mode
14/07/12 17:55:30 INFO executionengine.HExecutionEngine: Connecting to
hadoop file system at: file:///
14/07/12 17:55:30 INFO pigunit.PigTest: -- Load users from hdfs
users = LOAD 'src/test/resources/input/users.txt' USING PigStorage(',') AS
(id:long, firstName:chararray, lastName:chararray, country:chararray,
city:chararray, company:chararray);

-- Load ratings from hdfs
awesomenessRating = LOAD 'src/test/resources/input/rating.txt' USING
PigStorage(',') AS (userId:long, rating:long);

-- Join records by userId
joinedRecords = JOIN users BY id, awesomenessRating BY userId;

-- Filter users with awesomenessRating > 150
filteredRecords = FILTER joinedRecords BY awesomenessRating::rating > 150;

-- Generate fields that we are interested in
generatedRecords = FOREACH filteredRecords GENERATE
    users::id AS id,
    users::firstName AS firstName,
    users::country AS country,
    awesomenessRating::rating AS rating;

-- Store results
STORE generatedRecords INTO 'src/test/resources/results/awesomeness' USING
PigStorage();

14/07/12 17:55:30 INFO util.Utils: Default bootup file
C:\Users\krkrishnamoorthy/.pigbootup not found
users = LOAD 'src/test/resources/input/users.txt' USING PigStorage(',') AS
(id:long, firstName:chararray, lastName:chararray, country:chararray,
city:chararray, company:chararray);
--> users = LOAD 'src/test/resources/input/users.txt' USING PigStorage(',')
AS
(id:long,firstName:chararray,lastName:chararray,country:chararray,city:chararray,company:chararray);
awesomenessRating = LOAD 'src/test/resources/input/rating.txt' USING
PigStorage(',') AS (userId:long, rating:long);
 --> awesomenessRating = LOAD
'src/test/resources/input/awesomeness-rating.txt' USING PigStorage(',') AS
(userId:long, rating:long);
STORE generatedRecords INTO 'src/test/resources/results/awesomeness' USING
PigStorage();
--> none
14/07/12 17:55:31 INFO pigstats.ScriptState: Pig features used in the
script: HASH_JOIN
14/07/12 17:55:31 INFO optimizer.LogicalPlanOptimizer:
{RULES_ENABLED=[AddForEach, ColumnMapKeyPrune,
DuplicateForEachColumnRewrite, FilterLogicExpressionSimplifier,
GroupByConstParallelSetter, ImplicitSplitInserter, LimitOptimizer,
LoadTypeCastInserter, MergeFilter, MergeForEach,
NewPartitionFilterOptimizer, PartitionFilterOptimizer,
PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
14/07/12 17:55:31 INFO mapReduceLayer.MRCompiler: File concatenation
threshold: 100 optimistic? false
14/07/12 17:55:31 INFO
mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
POPackage->POForEach to POJoinPackage
14/07/12 17:55:31 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
before optimization: 1
14/07/12 17:55:31 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
after optimization: 1
14/07/12 17:55:31 INFO pigstats.ScriptState: Pig script settings are added
to the job
14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler:
mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
14/07/12 17:55:31 INFO data.SchemaTupleFrontend: Key [pig.schematuple] is
false, will not generate code.
14/07/12 17:55:31 INFO data.SchemaTupleFrontend: Starting process to move
generated code to distributed cache
14/07/12 17:55:31 INFO data.SchemaTupleFrontend: Distributed cache not
supported or needed in local mode. Setting key [pig.schematuple.local.dir]
with code temp directory:
C:\Users\KRKRIS~1\AppData\Local\Temp\1405212931260-0
14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Reduce phase
detected, estimating # of required reducers.
14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Using reducer
estimator:
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
14/07/12 17:55:31 INFO mapReduceLayer.InputSizeReducerEstimator:
BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=-1
14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Could not
estimate number of reducers and no requested or default parallelism set.
Defaulting to 1 reducer.
14/07/12 17:55:31 INFO mapReduceLayer.JobControlCompiler: Setting
Parallelism to 1
14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce
job(s) waiting for submission.
14/07/12 17:55:31 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
14/07/12 17:55:31 ERROR security.UserGroupInformation:
PriviledgedActionException as:krkrishnamoorthy cause:java.io.IOException:
Failed to set permissions of path:
\tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
to 0700
14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: 0% complete
14/07/12 17:55:31 WARN mapReduceLayer.MapReduceLauncher: Ooops! Some job
has failed! Specify -stop_on_failure if you want Pig to stop immediately on
failure.
14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: job null has
failed! Stop running all dependent jobs
14/07/12 17:55:31 INFO mapReduceLayer.MapReduceLauncher: 100% complete
14/07/12 17:55:31 WARN mapReduceLayer.Launcher: There is no log file to
write to.
14/07/12 17:55:31 ERROR mapReduceLayer.Launcher: Backend error message
during job submission
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
    at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
    at java.lang.Thread.run(Thread.java:744)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)

14/07/12 17:55:31 ERROR pigstats.SimplePigStats: ERROR: Failed to set
permissions of path:
\tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging
to 0700
14/07/12 17:55:31 ERROR pigstats.PigStatsUtil: 1 map reduce job(s) failed!
14/07/12 17:55:31 INFO pigstats.SimplePigStats: Detected Local mode. Stats
reported below may be incomplete
14/07/12 17:55:31 INFO pigstats.SimplePigStats: Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.2.1 0.12.0 krkrishnamoorthy 2014-07-12 17:55:31 2014-07-12 17:55:31
HASH_JOIN

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
N/A awesomenessRating,joinedRecords,users HASH_JOIN Message:
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-krkrishnamoorthy\mapred\staging\krkrishnamoorthy502928296\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
    at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
    at java.lang.Thread.run(Thread.java:744)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)
 file:/tmp/temp49116140/tmp1118481539,

Input(s):
Failed to read data from
"file:///C:/Users/krkrishnamoorthy/workspace/test/pig-unit-example/src/test/resources/input/awesomeness-rating.txt"
Failed to read data from
"file:///C:/Users/krkrishnamoorthy/workspace/test/pig-unit-example/src/test/resources/input/users.txt"

Output(s):
Failed to produce result in "file:/tmp/temp49116140/tmp1118481539"

Job DAG:
null

14/07/12 17:55:32 INFO mapReduceLayer.MapReduceLauncher: Failed!
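
The trace above fails in FileUtil.setPermission while the job staging
directory is being created on the local file system (file:///) and set to
0700. A small stand-alone probe can show whether this machine can honor such
a request at all. The sketch below is an illustration only, not Hadoop's
code: it assumes the failing step comes down to the java.io.File permission
setters, which the log does not confirm. The names PermissionProbe and
canApply0700 are hypothetical. The java.io.File Javadoc states that
setReadable(false, ...) fails when the underlying file system does not
implement a read permission, so on Windows the probe may well report false.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

// Hypothetical helper: checks whether java.io.File can apply rwx------ (0700)
// to a freshly created temporary directory on this machine.
final class PermissionProbe {

    static boolean canApply0700() {
        File dir;
        try {
            dir = Files.createTempDirectory("perm-probe").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            // Revoke each permission for everyone, then grant it to the owner only.
            // '&=' does not short-circuit, so every call is attempted.
            boolean ok = dir.setReadable(false, false);
            ok &= dir.setReadable(true, true);
            ok &= dir.setWritable(false, false);
            ok &= dir.setWritable(true, true);
            ok &= dir.setExecutable(false, false);
            ok &= dir.setExecutable(true, true);
            return ok;
        } finally {
            dir.delete(); // best-effort cleanup of the empty temp directory
        }
    }

    public static void main(String[] args) {
        System.out.println("os.name = " + System.getProperty("os.name"));
        System.out.println("can apply 0700 via java.io.File: " + canApply0700());
    }
}
```

If the probe reports false, that would suggest the failure comes from the
local file system's permission model rather than from the Pig script or the
input files.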


Thanks,
Krishnan
