I often do this, and then just REGISTER one giant .jar:
<!-- Plugin to create a single jar that includes all dependencies -->
<plugin>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>2.4</version>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
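
With that in the pom, mvn package should leave a second jar in target/ named <artifactId>-<version>-jar-with-dependencies.jar, so the Pig script needs only one REGISTER. A rough sketch, assuming a module called chopper at version 1.0-SNAPSHOT and a loader class called com.example.NTriplesLoader (all three names are made up for illustration, so substitute your own):

```pig
-- Jar name follows <artifactId>-<version>-jar-with-dependencies.jar;
-- "chopper" and "1.0-SNAPSHOT" are placeholders for your own module.
REGISTER target/chopper-1.0-SNAPSHOT-jar-with-dependencies.jar;

-- Hypothetical loader class; use the real UDF's fully qualified name.
triples = LOAD 'input.nt'
    USING com.example.NTriplesLoader()
    AS (s:chararray, p:chararray, o:chararray);
```

Guava and the other third-party libraries get bundled into that one jar, so they need no REGISTER lines of their own.
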
On Thu, Aug 8, 2013 at 4:32 PM, Paul Houle <[email protected]> wrote:
> I'm building a system for processing large RDF data sets with Hadoop.
>
> https://github.com/paulhoule/infovore/wiki
>
> The first stages are written in Java and perform the function of
> normalizing, validating and cleaning up the data.
>
> The stage that comes after this is going to subdivide Freebase into several
> major "horizontal" subdivisions that users may or may not want. For
> instance, Freebase uses two different vocabularies for expressing external
> keys -- they both represent 100+ million facts, so it's desirable to
> pick one you like and throw the other in the bit bucket.
>
> That phase will probably be written in Java, but to do the research to
> figure out how to partition it, I want to do ad-hoc queries with Pig.
>
> The first thing I'm working on is an input UDF for reading N-Triples files;
> rather than deeply parsing the Nodes, I'm splitting the triples up into
> three Texts. This process isn't too different from reading a white-space
> separated file, but it's a little more complicated because sometimes there
> are spaces in the object field. You also need to trim off a period and
> maybe some whitespace at the end.
>
> Now, it turns out that my UDF depends on classes I wrote distributed
> throughout three different Maven projects (the PrimitiveTriple parser has
> been around for a while) so I need to REGISTER multiple Jar files. I also
> heavily use Guava and other third-party libraries so the list of things I
> need to REGISTER is pretty big.
>
> What I'm trying now is to run this program
>
> https://github.com/paulhoule/infovore/blob/master/chopper/src/main/java/com/ontology2/chopper/tools/GenerateRegisterStatements.java
>
> piping it like so
>
> mvn dependency:build-classpath | mvn exec:java
> -Dexec.mainClass=com.ontology2.chopper.tools.GenerateRegisterStatements
>
> This could be integrated into the maven build process in the future.
>
> Anyway, is there a better way to do this?