[ 
https://issues.apache.org/jira/browse/PHOENIX-876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel Reid updated PHOENIX-876:
---------------------------------

    Attachment: PHOENIX-876.patch

Patch to remove the mapreduce assembly. I've tested this on a real cluster (3.0 
branch, Hadoop 1) and it works. 

The most important change is that the client jar can no longer be run as a 
standalone jar file (i.e. with "java -jar"), because its manifest no longer 
sets a mainClass value. That was never needed when running it via psql.py, 
so I assume that's ok.
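
One way to check whether a given jar can still be launched with "java -jar" is to look for a Main-Class entry in its manifest. The following is only a minimal sketch: the function names are made up for illustration, and the manifest is read from stdin so the logic can be tried without a real jar.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Phoenix.

# Succeeds only if the manifest text on stdin declares a Main-Class attribute.
has_main_class() {
    grep -qi '^Main-Class:'
}

# $1 = jar path; manifest text on stdin.
# Prints the general shape of the java command that applies to that jar.
launch_hint() {
    if has_main_class; then
        echo "java -jar $1"
    else
        # No Main-Class: the class to run has to be named explicitly.
        echo "java -cp $1 <main class>"
    fi
}

# Demo with a manifest that has no Main-Class entry.
printf 'Manifest-Version: 1.0\n' | launch_hint client.jar
```

Against a real jar, the manifest could be piped in with something like `unzip -p <jar> META-INF/MANIFEST.MF | launch_hint <jar>`, assuming unzip is installed.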

With this change, the mapreduce import can be started up as 
{code}./bin/hadoop jar phoenix-3.0.0-incubating-SNAPSHOT-client.jar 
org.apache.phoenix.mapreduce.CsvBulkLoadTool -t mytab -i /tmp/data.csv{code}

In other words, starting the MR job now also requires supplying a class name 
on the command line, which wasn't needed with the dedicated MR assembly.
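
For anyone who would rather not type the class name each time, a small wrapper can fill it in. This is just a sketch: the jar name, tool class and the -t/-i options are taken from the command above, while the function name and the HADOOP_BIN and PHOENIX_CLIENT_JAR variables are hypothetical. It only prints the command line rather than running it.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of Phoenix. Both variables can be overridden
# from the environment; the defaults mirror the command shown above.
PHOENIX_CLIENT_JAR="${PHOENIX_CLIENT_JAR:-phoenix-3.0.0-incubating-SNAPSHOT-client.jar}"
HADOOP_BIN="${HADOOP_BIN:-./bin/hadoop}"

# $1 = table name, $2 = input CSV path.
# Prints the "hadoop jar" command line for a CSV bulk load.
bulk_load_cmd() {
    if [ "$#" -ne 2 ]; then
        echo "usage: bulk_load_cmd <table> <input.csv>" >&2
        return 1
    fi
    # The tool class is named explicitly because the client jar sets no main class.
    echo "$HADOOP_BIN jar $PHOENIX_CLIENT_JAR org.apache.phoenix.mapreduce.CsvBulkLoadTool -t $1 -i $2"
}

bulk_load_cmd mytab /tmp/data.csv
```

Printing the command instead of executing it keeps the sketch safe to try on a machine without Hadoop; wrapping the call in `eval "$(bulk_load_cmd ...)"` or replacing `echo` with a direct invocation would actually start the job.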

BTW, the Hadoop client jars don't need to be bundled in order to run the MR 
job -- running it with "hadoop jar" already puts them on the classpath.

> Remove phoenix-*-mapreduce.jar if not necessary
> -----------------------------------------------
>
>                 Key: PHOENIX-876
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-876
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 4.0.0
>            Reporter: James Taylor
>            Assignee: Gabriel Reid
>         Attachments: PHOENIX-876.patch
>
>
> Do we need the 
> phoenix-assembly/target/phoenix-3.0.0-incubating-SNAPSHOT-mapreduce.jar to be 
> produced, as it pulls in a ton of stuff? Can our standard client jar be used 
> plus the hadoop.jar instead? If we don't need it, we should update the 
> pom/build stuff to not produce it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
