For simple testing you can use cloudera quickstart vm. All the lzo stuff can be configured in cloudera manager with a few clicks. 10 янв. 2014 г. 21:55 пользователь "Peter Sanford" <[email protected]> написал:
> Hello everybody! > > I'm getting started with pig and I'm trying to understand how to > configure io.compression.codecs for local mode. I've got something working > but I'm not entirely clear on why it works or if there is a better way to > do this. > > For some background, I'm trying to read data into pig from an lzo > compressed file that contains protobuf entries (using elephant-bird[1] and > hadoop-lzo[2]). The instructions for setting up hadoop-lzo say that you > need to add an entry to core-site.xml to specify the lzo compression codec. > Since I'm running in local mode I don't have a core-site.xml and couldn't > figure out if there was another place to configure io.compression.codecs. > Unsurprisingly, when I run a script that tries to load an lzo file I get a > 'No codec for file' error. > > I discovered that if I create a core-site.xml and put it in my > PIG_CLASSPATH, I can start pig without the '-x local' flag. When I run pig > like this it is able to load the lzo file and everything works correctly. > > Since I don't have hadoop installed, I assume that pig is still running in > some sort of local mode. Can someone explain what the difference is between > my two test cases above? > > Is there a different (better?) way to configure io.compression.codecs when > running in local mode? > > Thanks for the help, > -Peter > > [1]: https://github.com/kevinweil/elephant-bird > [2]: https://github.com/twitter/hadoop-lzo >
