Hi Renjith,

I started by setting up a single node following https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html, then a multi-node cluster, and used the streaming guide at https://hadoop.apache.org/docs/r1.2.1/streaming.html, along with some online tutorials.
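In case it helps, the basic streaming invocation from that guide looks roughly like this (the jar path matches a stock Hadoop 2.x layout, and the input/output paths and mapper binary name here are just illustrative):

    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
        -input /input \
        -output /output \
        -mapper ./process_tiff \
        -file process_tiff

The -file option ships the local binary to the task nodes so each mapper can execute it.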
Thanks,
Anna

On Tue, Mar 15, 2016 at 10:34 AM, Anna Guan <[email protected]> wrote:

> Hello,
> I am new to Hadoop and am trying to run a legacy C++ program that
> processes GeoTIFF files using Hadoop Streaming. The Hadoop version is
> 2.6.2. I'd like to open an image in HDFS, use hdfsRead() to read the
> file into a memory buffer, and then use the GDAL library to create a
> virtual memory file so that I can create a GDALDataset from it. I'd
> like to read the whole file into the buffer, but hdfsRead() only reads
> 65536 bytes each time. Is there any way to read the entire file into
> the buffer? I also set dfs.image.transfer.chunksize in the config
> file, but it did not help. When I run it I get: ERROR 4: `/vsimem/l1'
> not recognised as a supported file format. I think this is because I
> did not set up the buffer properly. Can anyone kindly tell me whether
> this is possible?
>
> Many thanks!
> Anna Guan
>
> // open hdfs file
> hdfsFile lfs = hdfsOpenFile(fs, "/input/L1.tif", O_RDONLY, 134217728, 0, 0);
> int size = hdfsAvailable(fs, lfs);
> char *data_buffer = (char *)CPLMalloc(size);
> int hasdata = -1;
> tOffset offset = hdfsTell(fs, lfs);
>
> while (hasdata) {
>     hasdata = hdfsRead(fs, lfs, data_buffer, size);
> }
> hdfsSeek(fs, lfs, offset);
>
> VSIFCloseL(VSIFileFromMemBuffer("/vsimem/l1", (GByte *)data_buffer, size, FALSE));
> GDALAllRegister();
> GDALDataset *readDS = (GDALDataset *)GDALOpen("/vsimem/l1", GA_ReadOnly);
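P.S. On the hdfsRead() question quoted above: the loop overwrites the start of the buffer on every call, so the data never accumulates, and hdfsRead() is allowed to return short reads (hence the 65536 bytes). The destination pointer has to advance each call. Below is a minimal, untested sketch of what I mean; it uses hdfsGetPathInfo() for the exact file length, since hdfsAvailable() only reports what can be read without blocking, and the connection arguments and paths are placeholders:

    #include <fcntl.h>        // O_RDONLY
    #include "hdfs.h"         // libhdfs
    #include "gdal_priv.h"    // GDALDataset
    #include "cpl_conv.h"     // CPLMalloc
    #include "cpl_vsi.h"      // VSIFileFromMemBuffer

    int main()
    {
        hdfsFS fs = hdfsConnect("default", 0);

        // Get the exact file length up front.
        hdfsFileInfo *info = hdfsGetPathInfo(fs, "/input/L1.tif");
        tOffset size = info->mSize;
        hdfsFreeFileInfo(info, 1);

        hdfsFile f = hdfsOpenFile(fs, "/input/L1.tif", O_RDONLY, 0, 0, 0);
        char *buf = (char *)CPLMalloc(size);

        // hdfsRead() returns short reads (often 64 KB), so advance the
        // destination pointer until the whole file has been copied in.
        tOffset total = 0;
        while (total < size) {
            tSize n = hdfsRead(fs, f, buf + total, (tSize)(size - total));
            if (n <= 0) break;   // 0 = EOF, -1 = error
            total += n;
        }
        hdfsCloseFile(fs, f);

        // Wrap the buffer as an in-memory GDAL file. Passing TRUE hands
        // ownership of the buffer to GDAL, so VSIUnlink("/vsimem/l1")
        // frees it once the dataset is no longer needed.
        GDALAllRegister();
        VSIFCloseL(VSIFileFromMemBuffer("/vsimem/l1", (GByte *)buf, total, TRUE));
        GDALDataset *ds = (GDALDataset *)GDALOpen("/vsimem/l1", GA_ReadOnly);
        return ds == NULL;
    }

With the buffer filled end to end, GDALOpen() should be able to identify the GeoTIFF instead of failing with ERROR 4.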
