Michael was correct - thank you! Once my directory had been added to livy.file.local-dir-whitelist, the POST /batches request worked using just file:/foo/bar/hello.py for the file argument. No pyFiles argument was required.
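For anyone following along, a minimal Python sketch of the request described above might look like the following. It only builds the POST /batches request rather than sending it. The host name livy-host, the port, and the helper name build_batch_request are assumptions made for illustration; only the file:/foo/bar/hello.py value and the single 'file' field come from this thread.

```python
import json
import urllib.request


def build_batch_request(livy_url: str, script_path: str) -> urllib.request.Request:
    """Build (but do not send) a POST /batches request for a script on the Livy server."""
    if not script_path.startswith("/"):
        raise ValueError("script_path must be an absolute path on the Livy server")
    # The file: prefix marks the path as local. Per this thread, the directory
    # must also be listed in livy.file.local-dir-whitelist on the Livy server.
    body = {"file": "file:" + script_path}
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host and port; replace with your own Livy endpoint.
    req = build_batch_request("http://livy-host:8998", "/foo/bar/hello.py")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit the batch: urllib.request.urlopen(req)
```

No pyFiles entry is included in the body, matching what worked here on Livy 0.3.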
One subtlety was that the Livy server was not running on the same machine as the Spark master node, and it appears the local directory has to be a directory on the Livy server itself. If I just put the file on the Spark master node I got:

python: can't open file '/foo/bar/hello.py': [Errno 2] No such file or directory
java.lang.Exception: spark-submit exited with code 2}

So is this true of all the file-related args in the POST /batches call then? I.e., must file, jars, pyFiles and files all reference paths on the Livy server itself? I’m currently using Livy 0.3.

Thanks, Lucas.

From: Partridge, Lucas (GE Aviation)
Sent: 06 November 2017 16:31
To: user@livy.incubator.apache.org
Subject: EXT: POST /batches failing with "Only local python files are supported"

Michael - thanks for your suggestion. At the moment any request I make with a file:/ argument immediately gets rejected as a ‘400 Bad Request’, but I’ve asked for livy.file.local-dir-whitelist to be modified in case that’ll make a difference. Unfortunately I don’t administer the Livy server so I’ll have to wait on that…

Alex - I tried the pyFiles arg with both the hdfs: and file: path but it made no difference to the “Only local python files are supported” error. Feel free to research it more, but I would expect the purpose of the pyFiles argument to be to supply any missing Python libraries that the Python application supplied in the file arg may depend on.

Thanks, Lucas.

From: Alex Bozarth [mailto:ajboz...@us.ibm.com]
Sent: 03 November 2017 20:11
To: user@livy.incubator.apache.org
Subject: EXT: Re: POST /batches failing with "Only local python files are supported"

Michael's solution might do it for you, but if it doesn't I would try the pyFiles arg instead. I haven't used it, but its purpose may be to fix this exact issue. I can research it more if needed.
Alex Bozarth
Software Engineer
Spark Technology Center
E-mail: ajboz...@us.ibm.com
GitHub: github.com/ajbozarth
505 Howard Street, San Francisco, CA 94105, United States

From: Michael Rhee <mr...@gogoair.com>
To: "user@livy.incubator.apache.org" <user@livy.incubator.apache.org>
Date: 11/03/2017 07:41 AM
Subject: Re: POST /batches failing with "Only local python files are supported"

Hi Lucas,

I believe you need to modify your Livy configuration file to allow access to a local directory on your master node. Something like the following:

livy.file.local-dir-whitelist = /home/hadoop-user/

Then use the file:/home/hadoop-user/file argument when you pass your request to Livy.

Hope that helps.

Best,
Michael

From: "Partridge, Lucas (GE Aviation)" <lucas.partri...@ge.com>
Reply-To: "user@livy.incubator.apache.org" <user@livy.incubator.apache.org>
Date: Friday, November 3, 2017 at 5:34 AM
To: "user@livy.incubator.apache.org" <user@livy.incubator.apache.org>
Subject: POST /batches failing with "Only local python files are supported"

Thanks for the suggestion Alex.
However, whenever I try anything beginning with file:/, file://, file:/// or file://NNPRDHA/ I get this error:

org.springframework.web.client.RestClientException: Error running rest call; nested exception is org.springframework.web.client.HttpClientErrorException: 400 Bad Request

It sounds like from what you’ve said it’s a Spark error rather than a Livy error, which I didn’t realise before. But whenever I put in a file argument value without a file: in front of it, it (Livy or Spark?) assumes it’s an HDFS path and prepends hdfs:// to it. Then Spark complains that only local Python files are supported.

I’ve also tried copying the Python file from HDFS to the local file system of the Spark master node. But I can’t specify that path in my POST call, because if I use file: I get 400 Bad Request, and if I don’t use file: it (Spark?) assumes it’s in HDFS!

Should I use the files argument for POST /batches too? Or the pyFiles argument, although I assumed that was for Python libraries required by the main application? I’ve tried lots of combinations but none have worked so far.

From: Alex Bozarth [mailto:ajboz...@us.ibm.com]
Sent: 02 November 2017 21:34
To: user@livy.incubator.apache.org
Subject: EXT: Re: POST /batches failing with "Only local python files are supported"

It sounds like you're passing in a local path and it's being treated as an HDFS path. Have you tried passing the path in with file:// at the front (similar to hdfs://)? That tells HDFS that the path is local. I've run into this issue with Spark before.
From: "Partridge, Lucas (GE Aviation)" <lucas.partri...@ge.com>
To: "user@livy.incubator.apache.org" <user@livy.incubator.apache.org>
Date: 11/02/2017 07:14 AM
Subject: POST /batches failing with "Only local python files are supported"

I want to use Livy (0.3) to run a Python file that I’ve placed in HDFS. I’m invoking POST /batches from a Java REST client, passing in the path to the HDFS file as the ‘file’ argument of the POST request’s body (https://github.com/apache/incubator-livy/tree/branch-0.3#request-body-2). The value I’m providing for ‘file’ is "/user/MyUserName/hello.py".
The POST response says the batch is in state ‘starting’ but when I query it using GET /batches/{batchId} I see this:

Error: Only local python files are supported: hdfs://NNPRDHA/user/MyUserName/hello.py

You can see the value has been altered from what I provided. How do I successfully invoke the Python file please?

(I see someone raised a similar problem at https://groups.google.com/a/cloudera.org/d/msg/livy-user/6AZeqtVwipg/U46tUjqNBwAJ but it’s not clear if or how they solved it.)

Thanks, Lucas.
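The behaviour described in this message, where a value with no scheme comes back prefixed with hdfs://NNPRDHA, can be caught on the client side before submitting. The sketch below is only an illustration: the helper name classify_file_arg and its labels are made up here, and it relies on nothing beyond Python's standard urlparse. It does not reproduce how Livy or Spark resolve paths internally.

```python
from urllib.parse import urlparse


def classify_file_arg(value: str) -> str:
    """Label a 'file' value by its URI scheme (labels are illustrative only)."""
    scheme = urlparse(value).scheme.lower()
    if scheme == "":
        # In this thread, a scheme-less path was returned with hdfs:// prepended.
        return "unqualified"
    if scheme == "file":
        # Local path; per the thread it must sit in a whitelisted directory
        # on the Livy server.
        return "local"
    return scheme


if __name__ == "__main__":
    for value in (
        "/user/MyUserName/hello.py",
        "file:/foo/bar/hello.py",
        "hdfs://NNPRDHA/user/MyUserName/hello.py",
    ):
        print(value, "->", classify_file_arg(value))
```

A client could use a check like this to refuse "unqualified" values, so the path is never silently reinterpreted.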