Hi Konstantin,

Could you please create a JIRA issue at 
https://issues.apache.org/jira/browse/SPARK/ so this problem can be tracked?


On July 2, 2014 at 11:45:24 PM, Konstantin Kudryavtsev 
(kudryavtsev.konstan...@gmail.com) wrote:

It sounds really strange...

I guess it is a bug, a critical one, and it must be fixed... at least some flag 
should be added (unable.hadoop).

I found the following workaround:
1) download compiled winutils.exe from 
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")

After that, the test runs.
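
The three steps above could be wrapped in a small helper, roughly as sketched below. The object and method names (`WinutilsWorkaround`, `ensureHadoopHome`) are made up for illustration, and the sketch assumes winutils.exe has already been placed under the `bin` subdirectory of the given directory. It would need to run before the first SparkContext is created, since the stack trace shows the lookup happening in a static initializer.

```scala
object WinutilsWorkaround {
  /** Hypothetical helper: sets hadoop.home.dir to `dir` unless a value is
    * already present, and returns the value that is in effect afterwards. */
  def ensureHadoopHome(dir: String): String = {
    val key = "hadoop.home.dir"
    Option(System.getProperty(key)) match {
      case Some(existing) => existing // respect an explicit setting
      case None =>
        System.setProperty(key, dir)
        dir
    }
  }
}

// Usage in the test, before `new SparkContext(...)`:
// WinutilsWorkaround.ensureHadoopHome("d:\\winutil\\")
```

Leaving an existing value untouched keeps the helper harmless on machines where the property is already configured.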

Thank you,
Konstantin Kudryavtsev

On Wed, Jul 2, 2014 at 10:24 PM, Denny Lee <denny.g....@gmail.com> wrote:
You don't actually need it per se - it's just that some of the Spark libraries 
are referencing Hadoop libraries even if they ultimately do not call them. When 
I was doing some early builds of Spark on Windows, I admittedly had Hadoop on 
Windows running as well and had not run into this particular issue.

On Wed, Jul 2, 2014 at 12:04 PM, Kostiantyn Kudriavtsev 
<kudryavtsev.konstan...@gmail.com> wrote:
No, I don’t.

Why do I need to have HDP installed? I don’t use Hadoop at all and I’d like to 
read data from the local filesystem.

On Jul 2, 2014, at 9:10 PM, Denny Lee <denny.g....@gmail.com> wrote:

By any chance do you have HDP 2.1 installed? You may need to install the utils 
and update the env variables per 

On Jul 2, 2014, at 10:20 AM, Konstantin Kudryavtsev 
<kudryavtsev.konstan...@gmail.com> wrote:

Hi Andrew,

It's Windows 7 and I didn't set up any env variables here.

The full stack trace:

14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the 
hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the 
Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

Thank you,
Konstantin Kudryavtsev

On Wed, Jul 2, 2014 at 8:15 PM, Andrew Or <and...@databricks.com> wrote:
Hi Konstantin,

We use Hadoop as a library in a few places in Spark. I wonder why the path 
includes "null" though.

Could you provide the full stack trace?
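
One plausible explanation for the "null" is sketched below. If the Hadoop home directory is looked up from the `hadoop.home.dir` system property or the `HADOOP_HOME` environment variable and neither is set, concatenating the missing value into a path produces the literal text "null". This is an illustration of that mechanism under those assumptions, not Hadoop's actual source; `NullPathDemo`, `resolveHome` and `winutilsPath` are hypothetical names.

```scala
object NullPathDemo {
  // Hypothetical stand-in for the lookup: system property first, then the
  // environment variable; yields null when neither is defined.
  def resolveHome(prop: Option[String], env: Option[String]): String =
    prop.orElse(env).orNull

  // String concatenation renders a null reference as the text "null".
  def winutilsPath(home: String): String =
    home + "\\bin\\winutils.exe"
}

// With nothing configured:
// NullPathDemo.winutilsPath(NullPathDemo.resolveHome(None, None))
// gives null\bin\winutils.exe, matching the path in the error message.
```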


2014-07-02 9:38 GMT-07:00 Konstantin Kudryavtsev 

Hi all,

I'm trying to run some transformations on Spark. They work fine on a cluster 
(YARN, Linux machines). However, when I try to run them on my local machine 
(Windows 7) in a unit test, I get errors:

java.io.IOException: Could not locate executable null\bin\winutils.exe in the 
Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)

My code is as follows:

import org.apache.spark.{SparkConf, SparkContext}

def testETL() = {
    val conf = new SparkConf()
    val sc = new SparkContext("local", "test", conf)
    try {
        val etl = new IxtoolsDailyAgg() // empty constructor

        val data = sc.parallelize(List("in1", "in2", "in3"))

        etl.etl(data) // rdd transformation, no access to SparkContext or Hadoop
    } finally {
        if (sc != null) sc.stop() // release the local context even if the test fails
    }
}

Why is it trying to access Hadoop at all, and how can I fix it? Thank you in 
advance.

Thank you,
Konstantin Kudryavtsev
