It sounds really strange...

I guess it is a bug, a critical bug, and it must be fixed... or at least some
flag (e.g. an unable.hadoop switch) should be added to disable the Hadoop dependency.

I found the following workaround:
1) download compiled winutils.exe from
http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) put this file into d:\winutil\bin
3) add this to my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")

After that, the test runs. A minimal sketch of the workaround in a test is below.
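For reference, here is a minimal sketch of how the workaround could look in a
local-mode JUnit test (the class and test names are just placeholders, and
d:\winutil\ is only an example path to the folder containing bin\winutils.exe;
the property has to be set before the first SparkContext is created, because
Hadoop's Shell class looks it up in a static initializer):

import org.apache.spark.{SparkConf, SparkContext}
import org.junit.{Assert, Test}

class LocalEtlTest {
  @Test
  def testRunsLocallyOnWindows(): Unit = {
    // Must run before the first SparkContext is created:
    // org.apache.hadoop.util.Shell reads hadoop.home.dir in a static initializer.
    System.setProperty("hadoop.home.dir", "d:\\winutil\\") // folder containing bin\winutils.exe

    val sc = new SparkContext("local", "test", new SparkConf())
    try {
      val data = sc.parallelize(List("in1", "in2", "in3"))
      Assert.assertEquals(3L, data.count())
    } finally {
      sc.stop()
    }
  }
}

The same property could also be set once in a shared test base class or in the
build tool's test configuration, so individual tests don't have to know about it.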

Thank you,
Konstantin Kudryavtsev


On Wed, Jul 2, 2014 at 10:24 PM, Denny Lee <denny.g....@gmail.com> wrote:

> You don't actually need it per se - it's just that some of the Spark
> libraries are referencing Hadoop libraries even if they ultimately do not
> call them. When I was doing some early builds of Spark on Windows, I
> admittedly had Hadoop on Windows running as well and had not run into this
> particular issue.
>
>
>
> On Wed, Jul 2, 2014 at 12:04 PM, Kostiantyn Kudriavtsev <
> kudryavtsev.konstan...@gmail.com> wrote:
>
>> No, I don't
>>
>> Why do I need to have HDP installed? I don't use Hadoop at all and I'd
>> like to read data from the local filesystem.
>>
>> On Jul 2, 2014, at 9:10 PM, Denny Lee <denny.g....@gmail.com> wrote:
>>
>> By any chance do you have HDP 2.1 installed? You may need to install the
>> utils and update the env variables per
>> http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows
>>
>>
>> On Jul 2, 2014, at 10:20 AM, Konstantin Kudryavtsev <
>> kudryavtsev.konstan...@gmail.com> wrote:
>>
>> Hi Andrew,
>>
>> it's Windows 7 and I haven't set up any env variables here
>>
>> The full stack trace:
>>
>> 14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop
>> library for your platform... using builtin-java classes where applicable
>> 14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in
>> the hadoop binary path
>> java.io.IOException: Could not locate executable null\bin\winutils.exe in
>> the Hadoop binaries.
>> at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
>>  at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
>> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
>>  at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
>> at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
>>  at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
>> at
>> org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
>>  at
>> org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
>> at
>> org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
>>  at
>> org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
>> at
>> org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
>>  at
>> org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
>>  at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
>> at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
>>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>  at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:606)
>>  at junit.framework.TestCase.runTest(TestCase.java:168)
>> at junit.framework.TestCase.runBare(TestCase.java:134)
>>  at junit.framework.TestResult$1.protect(TestResult.java:110)
>> at junit.framework.TestResult.runProtected(TestResult.java:128)
>>  at junit.framework.TestResult.run(TestResult.java:113)
>> at junit.framework.TestCase.run(TestCase.java:124)
>>  at junit.framework.TestSuite.runTest(TestSuite.java:232)
>> at junit.framework.TestSuite.run(TestSuite.java:227)
>>  at
>> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
>> at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
>>  at
>> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
>> at
>> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
>>  at
>> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>  at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>  at java.lang.reflect.Method.invoke(Method.java:606)
>> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
>>
>>
>> Thank you,
>> Konstantin Kudryavtsev
>>
>>
>> On Wed, Jul 2, 2014 at 8:15 PM, Andrew Or <and...@databricks.com> wrote:
>>
>>> Hi Konstantin,
>>>
>>> We use hadoop as a library in a few places in Spark. I wonder why the
>>> path includes "null" though.
>>>
>>> Could you provide the full stack trace?
>>>
>>> Andrew
>>>
>>>
>>> 2014-07-02 9:38 GMT-07:00 Konstantin Kudryavtsev <
>>> kudryavtsev.konstan...@gmail.com>:
>>>
>>> Hi all,
>>>>
>>>> I'm trying to run some transformations on *Spark*; they work fine on the
>>>> cluster (YARN, Linux machines). However, when I try to run them on a local
>>>> machine (*Windows 7*) from a unit test, I get the following errors:
>>>>
>>>> java.io.IOException: Could not locate executable null\bin\winutils.exe in 
>>>> the Hadoop binaries.
>>>> at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
>>>> at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
>>>> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
>>>> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
>>>> at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
>>>>
>>>>
>>>> My code is following:
>>>>
>>>> @Test
>>>> def testETL() = {
>>>>     val conf = new SparkConf()
>>>>     val sc = new SparkContext("local", "test", conf)
>>>>     try {
>>>>         val etl = new IxtoolsDailyAgg() // empty constructor
>>>>
>>>>         val data = sc.parallelize(List("in1", "in2", "in3"))
>>>>
>>>>         etl.etl(data) // rdd transformation, no access to SparkContext or 
>>>> Hadoop
>>>>         Assert.assertTrue(true)
>>>>     } finally {
>>>>         if(sc != null)
>>>>             sc.stop()
>>>>     }
>>>> }
>>>>
>>>>
>>>> Why is it trying to access Hadoop at all, and how can I fix it? Thank
>>>> you in advance.
>>>>
>>>> Thank you,
>>>> Konstantin Kudryavtsev
>>>>
>>>
>>>
>>
>>
>
