Tried adding 50010, 50020 and 50090. Still no difference. I can't
imagine I'm the only person on the planet wanting to do this.

Anyway, thanks for trying to help.

Dino.

On 25 August 2015 at 08:22, Roberto Congiu <roberto.con...@gmail.com> wrote:
> Port 8020 is not the only port you need tunnelled for HDFS to work. If you
> only list the contents of a directory, port 8020 is enough... for instance,
> using something like
>
>   val p = new org.apache.hadoop.fs.Path("hdfs://localhost:8020/")
>   val fs = p.getFileSystem(sc.hadoopConfiguration)
>   fs.listStatus(p)
>
> you should see the file list.
> But when accessing a file, the client needs to actually fetch its blocks,
> so it has to connect to the DataNode.
> The error 'could not obtain block' means it can't get that block from the
> DataNode.
> Refer to
> http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_reference/content/reference_chap2_1.html
> to see the complete list of ports that also need to be tunnelled.
>
> 2015-08-24 13:10 GMT-07:00 Dino Fancellu <d...@felstar.com>:
>> Changing the IP to the guest IP address just never connects.
>>
>> The VM has port tunnelling, and it passes all the main ports,
>> 8020 included, through to the host.
>>
>> You can tell that it was talking to the guest VM before, simply
>> because it reported when a file was not found.
>>
>> The error is:
>>
>>   Exception in thread "main" org.apache.spark.SparkException: Job
>>   aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most
>>   recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost):
>>   org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
>>   BP-452094660-10.0.2.15-1437494483194:blk_1073742905_2098
>>   file=/tmp/people.txt
>>
>> but I have no idea what it means by that. It can certainly find the
>> file and knows it exists.
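For what it's worth, a workaround that often helps with NAT'ed Hadoop VMs (an assumption on my part, not something verified on this exact sandbox): the BlockMissingException above shows the client being handed the DataNode's internal NAT address (10.0.2.15), which the Windows host cannot reach even when the ports themselves are tunnelled. The HDFS client can be asked to contact DataNodes by hostname instead, via the standard client property dfs.client.use.datanode.hostname. A minimal sketch:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HdfsOverNat {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("My App")
    val sc   = new SparkContext(conf)

    // Have the HDFS client dial DataNodes by hostname rather than the
    // internal NAT address (10.0.2.15) that the NameNode reports back.
    sc.hadoopConfiguration.set("dfs.client.use.datanode.hostname", "true")

    // Assumes the guest's hostname resolves on the host (e.g. via a
    // hosts-file entry), or that 50010 is tunnelled alongside 8020.
    val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
    println(words.count())
  }
}
```

This only changes how the client resolves DataNodes; treat it as a starting point rather than a guaranteed fix, since it still depends on the hostname (or tunnel) actually reaching the guest's port 50010.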
>>
>> On 24 August 2015 at 20:43, Roberto Congiu <roberto.con...@gmail.com> wrote:
>> > When you launch your HDP guest VM, it most likely gets launched with NAT
>> > and an address on a private network (192.168.x.x), so on your Windows
>> > host you should use that address (you can find it by running ifconfig on
>> > the guest OS).
>> > I usually add an entry to my /etc/hosts for VMs that I use often... if
>> > you use Vagrant, there's also a Vagrant module that can do that
>> > automatically.
>> > Also, I am not sure how the default HDP VM is set up, that is, whether it
>> > binds HDFS only to 127.0.0.1 or to all addresses. You can check that with
>> > netstat -a.
>> >
>> > R.
>> >
>> > 2015-08-24 11:46 GMT-07:00 Dino Fancellu <d...@felstar.com>:
>> >> I have a file in HDFS inside my HortonWorks HDP 2.3_1 VirtualBox VM.
>> >>
>> >> If I go into the guest spark-shell and refer to the file thus, it works
>> >> fine:
>> >>
>> >>   val words = sc.textFile("hdfs:///tmp/people.txt")
>> >>   words.count
>> >>
>> >> However, if I try to access it from a local Spark app on my Windows
>> >> host, it doesn't work:
>> >>
>> >>   val conf = new SparkConf().setMaster("local").setAppName("My App")
>> >>   val sc = new SparkContext(conf)
>> >>
>> >>   val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
>> >>   words.count
>> >>
>> >> Emits
>> >>
>> >>
>> >> Port 8020 is open, and if I choose the wrong file name, it will tell me
>> >>
>> >>
>> >> My pom has
>> >>
>> >>   <dependency>
>> >>     <groupId>org.apache.spark</groupId>
>> >>     <artifactId>spark-core_2.11</artifactId>
>> >>     <version>1.4.1</version>
>> >>     <scope>provided</scope>
>> >>   </dependency>
>> >>
>> >> Am I doing something wrong?
>> >>
>> >> Thanks.
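Roberto's netstat suggestion can also be approximated from the host side: a quick connectivity probe (plain JVM code, no Spark required) shows whether the usual HDP ports actually answer through the tunnel. The port numbers below are the stock HDP 2.x defaults, which is an assumption about this particular VM:

```scala
import java.net.{InetSocketAddress, Socket}

object PortProbe {
  // Returns true if a TCP connection to host:port succeeds within 1 second.
  def portOpen(host: String, port: Int): Boolean = {
    val s = new Socket()
    try { s.connect(new InetSocketAddress(host, port), 1000); true }
    catch { case _: Exception => false }
    finally s.close()
  }

  def main(args: Array[String]): Unit = {
    // 8020 = NameNode IPC, 50010 = DataNode data transfer,
    // 50070 = NameNode web UI (HDP 2.x defaults).
    for (p <- Seq(8020, 50010, 50070))
      println(s"port $p open: ${portOpen("localhost", p)}")
  }
}
```

If 8020 answers but 50010 does not, that would match the symptom in this thread: directory listings work (NameNode only) while reads fail with 'could not obtain block' (DataNode unreachable).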
>> >>
>> >> --
>> >> View this message in context:
>> >> http://apache-spark-user-list.1001560.n3.nabble.com/Local-Spark-talking-to-remote-HDFS-tp24425.html
>> >> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> >> For additional commands, e-mail: user-h...@spark.apache.org