library() is the package-loading function in native R, and as of now it does not support HDFS paths, although there are several packages out there that might help.
Another approach is a prefetch/installation step that calls the HDFS command line to download the R package from HDFS onto the worker node first, and then loads it from the local path (a rough sketch is appended below the quoted message).

_____________________________
From: Senthil Kumar <senthilec...@gmail.com>
Sent: Wednesday, August 17, 2016 2:23 AM
Subject: Spark R - Loading Third Party R Library in YARN Executors
To: Senthil kumar <senthilec...@gmail.com>, <du...@ebay.com>, <jiaj...@ebay.com>, <dev@spark.apache.org>

Hi All,

We are using the Spark 1.6 R library. Below is our code, which loads the third-party library:

    library("BreakoutDetection", lib.loc = "hdfs://xxxxxx/BreakoutDetection/")
    library("BreakoutDetection", lib.loc = "//xxxxxx/BreakoutDetection/")

When I execute the code in local mode, the SparkR code works fine without any issue. If I submit the job to the cluster, we end up with this error:

    error in evaluating the argument 'X' in selecting a method for function 'lapply':
    Error in library("BreakoutDetection", lib.loc = "hdfs://xxxxxxx/BreakoutDetection/") :
      no library trees found in 'lib.loc'
    Calls: f ... lapply -> FUN -> mainProcess -> angleValid -> library

Can't we read libraries in R as below?

    library("BreakoutDetection", lib.loc = "hdfs://xxxxxx/BreakoutDetection/")

If not, what is the other way to solve this problem? Since our cluster has close to 2500 nodes, we can't copy the third-party libs to all nodes, and copying to all DNs is not good practice either. Can someone help me with how to load R libs from HDFS, or any other way?

--Senthil
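
Here is a rough, untested sketch of that prefetch idea, meant to be called at the top of the function that runs on each executor. The helper name and the local temp directory are made up for illustration, and it assumes the `hdfs` CLI is available on the PATH of every worker node:

    # Hypothetical helper: pull the package directory down from HDFS on the
    # executor, then load it from the local copy instead of an hdfs:// lib.loc.
    loadPackageFromHDFS <- function(pkg, hdfsDir) {
      localLib <- file.path(tempdir(), "rlibs")
      dir.create(localLib, recursive = TRUE, showWarnings = FALSE)
      localPkg <- file.path(localLib, pkg)
      if (!file.exists(localPkg)) {
        # Copy e.g. hdfs://xxxxxx/BreakoutDetection/BreakoutDetection into the local lib dir.
        system(paste("hdfs dfs -copyToLocal", file.path(hdfsDir, pkg), localLib))
      }
      library(pkg, lib.loc = localLib, character.only = TRUE)
    }

    # Then, inside the function that gets shipped to the executors, something like:
    # loadPackageFromHDFS("BreakoutDetection", "hdfs://xxxxxx/BreakoutDetection")

Whether the HDFS directory is itself the library tree or a directory containing the installed package depends on how it was uploaded, so the exact hdfsDir/pkg layout above is an assumption; adjust the paths to match what is actually on HDFS.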