Re: [discuss] Modernization of Cassandra build system

2015-03-31 Thread Tyler Hobbs
Hi Łukasz,

I'm not very familiar with the build system, but I'll try to respond.

The Serializer dependencies on org.apache.cassandra.transport are almost
certainly uses of Server.CURRENT_VERSION and Server.VERSION_3.  These are
constants that represent the native protocol version in use, which affects
how certain types are serialized.  These constants could easily be moved.
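
To illustrate the kind of move described above, here is a minimal Java sketch. The class names `ProtocolVersions` and `CollectionSizeCodec`, the constant values, and the size rule are assumptions made for illustration; none of them is taken from the Cassandra source.

```java
// Hypothetical holder class living next to the serializers, so that they
// no longer need anything from org.apache.cassandra.transport.
final class ProtocolVersions {
    static final int VERSION_3 = 3;               // assumed value
    static final int CURRENT_VERSION = VERSION_3; // assumed to track the newest version

    private ProtocolVersions() {}
}

// Hypothetical serializer-side helper that depends only on the holder above.
final class CollectionSizeCodec {
    // Illustrative rule: from version 3 on, a collection size takes 4 bytes,
    // earlier versions use 2 bytes.
    static int sizeOfCollectionSize(int protocolVersion) {
        return protocolVersion >= ProtocolVersions.VERSION_3 ? 4 : 2;
    }

    private CollectionSizeCodec() {}
}
```

With a holder like this, the transport code could keep referring to the same constants, so the dependency would point from transport to the serializers rather than the other way round.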

The o.a.c.marshal dependency in MapSerializer is on AbstractType, but could
easily be replaced with java.util.Comparator.
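
A rough sketch of that replacement, assuming the serializer only needs an ordering over serialized keys. `SortedKeyMapSketch` is a hypothetical, heavily simplified stand-in and not the real MapSerializer.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in: it takes a plain java.util.Comparator instead of a
// type from o.a.c.db.marshal, so it compiles without cassandra-all.
final class SortedKeyMapSketch {
    private final Comparator<ByteBuffer> keyComparator;

    SortedKeyMapSketch(Comparator<ByteBuffer> keyComparator) {
        this.keyComparator = keyComparator;
    }

    // Returns a copy of the input whose entries are ordered by the injected comparator.
    Map<ByteBuffer, ByteBuffer> sorted(Map<ByteBuffer, ByteBuffer> input) {
        Map<ByteBuffer, ByteBuffer> out = new TreeMap<>(keyComparator);
        out.putAll(input);
        return out;
    }
}
```

A caller that has a richer type at hand can pass it in wherever it already implements `Comparator<ByteBuffer>`; a standalone client could pass `Comparator.naturalOrder()` instead.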

In any case, I'm not necessarily opposed to improving the build system to
make these errors more apparent.  Would your proposal still allow us to
build with ant (and just change the way those artifacts are built)?

On Tue, Mar 24, 2015 at 7:58 PM, Łukasz Dywicki l...@code-house.org wrote:

 Dear Cassandra committers and development process followers,
 I would like to raise an important topic: the build process of Cassandra. I
 am an external user from the community's point of view, but I have been
 around various projects close to Cassandra for the past year or more.
 What worries me a lot is how Cassandra publishes its artifacts and how
 many problems are reported because of that.

 First of all, I want to note that I am not a born enemy of Ant itself. I
 have never used it. I am also aware of the problems with custom builds made
 with Maven. I don’t really want to discuss any particular replacement,
 but I do want to note that the Cassandra JIRA project contains about 116
 issues related in some way to Maven (http://bit.ly/1GRoXl5,
 project=CASSANDRA, text ~ maven). Depending on the point of view, that might
 be a lot or a little. By simple statistics it is around 21 issues a year, or
 almost 2 issues a month, many of them breaking maintenance/major releases
 from the user's point of view. On the other hand, it’s not bad considering
 how the project is being built.

 The current structure has a very big disadvantage: ONE source root for
 multiple artifacts published in Maven repositories, with classes copied into
 each jar AFTER they are compiled. Obviously the ant copy task doesn’t follow
 import statements and does not include dependent classes. For example, just
 by making test relocations and extracting the clientutil jar on the master
 branch into a separate source root, I found a bug where ListSerializer
 depends on the org.apache.cassandra.transport package. Moreover, clientutil
 (MapSerializer) depends on the org.apache.cassandra.db.marshal package,
 which means it cannot be used without cassandra-all present on the
 classpath.
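
The kind of leak described above can be caught by a very small check over source text. The following is only a sketch: `ImportLeakCheck` is a hypothetical helper, the sample source is made up, and treating `org.apache.cassandra.serializers` as the one allowed package is an assumption. Only the two offending package names come from this mail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker, not part of any Cassandra or ant tooling.
final class ImportLeakCheck {
    // Matches "import a.b.C;", "import static a.b.C.X;" and "import a.b.*;".
    private static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.*]+)\\s*;", Pattern.MULTILINE);

    // Returns imports that belong to the project but lie outside the allowed packages.
    static List<String> leakedImports(String source, String projectPrefix,
                                      List<String> allowedPrefixes) {
        List<String> leaks = new ArrayList<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            String name = m.group(1);
            if (!name.startsWith(projectPrefix)) {
                continue; // JDK or third-party import, not our concern here
            }
            boolean allowed = false;
            for (String prefix : allowedPrefixes) {
                if (name.startsWith(prefix)) {
                    allowed = true;
                    break;
                }
            }
            if (!allowed) {
                leaks.add(name);
            }
        }
        return leaks;
    }

    public static void main(String[] args) {
        // Made-up source text for demonstration.
        String source = String.join("\n",
                "package org.apache.cassandra.serializers;",
                "import java.util.List;",
                "import org.apache.cassandra.transport.Server;",
                "import org.apache.cassandra.db.marshal.AbstractType;");
        System.out.println(leakedImports(source, "org.apache.cassandra.",
                Arrays.asList("org.apache.cassandra.serializers.")));
    }
}
```

A separate source root gives the same guarantee for free, because the compiler then refuses to build a class whose imports are not on the module's classpath.
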
 Luckily for Cassandra, CQL as a new interface reduces thrift and clientutil
 usage, which reduces the number of issues reported around these. However,
 this just hides the real problem described in the previous paragraph. I have
 found a handy tool and made a graph of circular dependencies in
 cassandra-all.jar. The resulting graph can be found here:
 http://grab.by/FRnO. As you can see, this graph has multiple levels and
 untangling it is not a simple task.
 I am afraid the current way of building and packaging Cassandra can create
 huge hiccups when it comes to code refactorings, because the whole of
 Cassandra will become a house of cards.
 Restructuring the project into smaller pieces is also beneficial for the
 community, since solving bugs in smaller units is definitely easier.
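
For readers who cannot open the graph, the idea of a circular package dependency can be sketched with a depth-first search over a package-to-dependencies map. `PackageCycleFinder` is a hypothetical helper written for this illustration; it is not the tool used to produce the graph, and the input map would have to come from some other analysis.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: finds one dependency cycle, if any exists.
final class PackageCycleFinder {
    // Returns one cycle as a list of nodes (first element repeated at the end),
    // or an empty list when the graph is acyclic.
    static List<String> findCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();
        for (String start : deps.keySet()) {
            List<String> cycle = visit(start, deps, done, new ArrayList<>());
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return Collections.emptyList();
    }

    private static List<String> visit(String node, Map<String, List<String>> deps,
                                      Set<String> done, List<String> path) {
        int idx = path.indexOf(node);
        if (idx >= 0) {
            // The node is already on the current path, so we have closed a loop.
            List<String> cycle = new ArrayList<>(path.subList(idx, path.size()));
            cycle.add(node);
            return cycle;
        }
        if (done.contains(node)) {
            return Collections.emptyList(); // fully explored earlier, no cycle through it
        }
        path.add(node);
        for (String next : deps.getOrDefault(node, Collections.emptyList())) {
            List<String> cycle = visit(next, deps, done, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1); // backtrack
        done.add(node);
        return Collections.emptyList();
    }
}
```

Each cycle reported this way marks a group of packages that cannot be split into separate source roots until at least one of its edges is removed.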

 At the end of this mail I would like to propose moving the Cassandra build
 system forward, regardless of which tool is chosen for it. Personally
 I can volunteer for the Maven-related changes to extract cassandra-thrift,
 cassandra-clientutil and cassandra-all into a regular Maven build. It
 might be seen as a switch from one big XML file to a couple of smaller
 ones. :-) All of this depends on the Cassandra developers' decision whether
 to divide the source roots or not.

 Kind regards,
 Łukasz Dywicki
 —
 l...@code-house.org
 Twitter: ldywicki
 Blog: http://dywicki.pl
 Code-House - http://code-house.org




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: [discuss] Modernization of Cassandra build system

2015-03-31 Thread Benedict Elliott Smith
I think the problem is everyone currently contributing is comfortable with
ant, and as much as it is imperfect, it isn't clear maven is going to be
better. Having the requisite maven functionality linked under the hood
doesn't seem particularly preferable to the inverse. The status quo has the
bonus of zero upheaval for the project and its contributors, though, so it
would have to be a very clear win to justify the change in my opinion.


On Tue, Mar 31, 2015 at 10:24 PM, Łukasz Dywicki l...@code-house.org
wrote:

 Hey Tyler,
 Thank you very much for coming back. I had already lost faith that I would
 get a reply. :-) I am fine with code relocations. Moving constants into one
 place where they cause no circular dependencies is cool; I’m all for doing
 that.

 Currently Cassandra uses ant to do some of Maven's jobs (such as
 deploying pom.xml files into repositories with dependency information), and
 it also uses Maven-style artifact repositories. This can easily be flipped:
 Maven can call ant tasks for the parts that cannot be done with existing
 Maven plugins. Here is the simplest example:
 http://docs.codehaus.org/display/MAVENUSER/Antrun+Plugin - you can see an
 ant task definition embedded in a Maven pom.xml.

 Most things can be done at this moment via Maven plugins:
 apache-rat-plugin:
 http://mvnrepository.com/artifact/org.apache.rat/apache-rat-plugin/0.11
 maven-thrift-plugin:
 http://mvnrepository.com/artifact/org.apache.thrift.tools/maven-thrift-plugin/0.1.11
 antlr4-maven-plugin:
 http://mvnrepository.com/artifact/org.antlr/antlr4-maven-plugin/4.5 or
 antlr3-maven-plugin:
 http://mvnrepository.com/artifact/org.antlr/antlr3-maven-plugin/3.5.2
 maven-gpg-plugin:
 http://mvnrepository.com/artifact/org.apache.maven.plugins/maven-gpg-plugin/1.6
 cobertura-maven-plugin: http://mojo.codehaus.org/cobertura-maven-plugin/
 (but these days jacoco with java agent instrumentation performs better)
 ... and so on

 I have already made some evaluation of the impact and it is big. Code has to
 be separated into different source roots. That is not easy even when keeping
 the current artifact structure: cassandra-all, cassandra-thrift and
 clientutil (because of cyclic dependencies). What I can do is prepare these
 source roots with the dependencies declared for them and push that to my
 Cassandra fork, so you will be able to verify it and continue with
 relocations if you like the new build. Creating new modules (source roots)
 with Maven is simple, so you could possibly extract more than these 3
 predefined artifacts/package roots.
 Just let me know if you are interested.

 Kind regards,
 Lukasz



Re: [discuss] Modernization of Cassandra build system

2015-03-31 Thread Łukasz Dywicki
Hey Tyler,
Thank you very much for coming back. I had already lost faith that I would get
a reply. :-) I am fine with code relocations. Moving constants into one place
where they cause no circular dependencies is cool; I’m all for doing that.

Currently Cassandra uses ant to do some of Maven's jobs (such as deploying
pom.xml files into repositories with dependency information), and it also uses
Maven-style artifact repositories. This can easily be flipped: Maven can call
ant tasks for the parts that cannot be done with existing Maven plugins.
Here is the simplest example:
http://docs.codehaus.org/display/MAVENUSER/Antrun+Plugin - you can see an ant
task definition embedded in a Maven pom.xml.

Most things can be done at this moment via Maven plugins:
apache-rat-plugin:
http://mvnrepository.com/artifact/org.apache.rat/apache-rat-plugin/0.11
maven-thrift-plugin:
http://mvnrepository.com/artifact/org.apache.thrift.tools/maven-thrift-plugin/0.1.11
antlr4-maven-plugin:
http://mvnrepository.com/artifact/org.antlr/antlr4-maven-plugin/4.5 or
antlr3-maven-plugin:
http://mvnrepository.com/artifact/org.antlr/antlr3-maven-plugin/3.5.2
maven-gpg-plugin:
http://mvnrepository.com/artifact/org.apache.maven.plugins/maven-gpg-plugin/1.6
cobertura-maven-plugin: http://mojo.codehaus.org/cobertura-maven-plugin/
(but these days jacoco with java agent instrumentation performs better)
... and so on

I have already made some evaluation of the impact and it is big. Code has to be
separated into different source roots. That is not easy even when keeping the
current artifact structure: cassandra-all, cassandra-thrift and clientutil
(because of cyclic dependencies). What I can do is prepare these source roots
with the dependencies declared for them and push that to my Cassandra fork, so
you will be able to verify it and continue with relocations if you like the
new build. Creating new modules (source roots) with Maven is simple, so you
could possibly extract more than these 3 predefined artifacts/package roots.
Just let me know if you are interested.

Kind regards,
Lukasz

