Hey Andreas,

On Thu, Nov 6, 2014 at 10:53 AM, Andreas Tille <[email protected]> wrote:

> >
> > Upstream for Debian means a temporary upstream placeholder to use for
> > package building, whilst the KMC authors have a discussion with the
> > asmlib author, to try to get him to distribute his code in a VCS manner.
> > If you look at [1], the code is distributed as a zip file that contains
> > the compiled static libraries for several OS's and architectures. The
> > source code is another zip file inside the original one. It contains a
> > pdf plus other zip files. The source code itself uses a nmake makefile
> > that performs cross compilation, through the use of yet another piece
> > of code written by the author.
>
> OK, I see.  However, I'd prefer a debian/get-orig-script fetching the
> zipfile and extracting the source from there.
>

Sounds good. Looking into this now.
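
For the extraction step, here is a minimal Python sketch of what I have in
mind, assuming the source really is a zip nested inside the downloaded zip as
described above. The function name and the archive names passed to it are
placeholders of mine, not anything upstream defines.

```python
import io
import zipfile


def extract_inner_zip(outer_bytes, inner_name, dest_dir):
    """Unpack the zip archive `inner_name` that is stored inside another zip.

    outer_bytes: raw bytes of the downloaded (outer) zip file.
    inner_name:  member name of the nested source zip (hypothetical name).
    dest_dir:    directory to extract the source files into.
    Returns the sorted list of extracted member names.
    """
    # Read the nested archive out of the outer one without touching disk.
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        inner_bytes = outer.read(inner_name)
    # Extract only the nested archive; the prebuilt libraries in the
    # outer zip are ignored.
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        inner.extractall(dest_dir)
        return sorted(inner.namelist())
```

Downloading the zip and repacking the result into an orig tarball are left
out here; this only covers getting the sources out of the nested archive.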


>
> > There is no versioning system whatsoever for asmlib. The author simply
> > states that the most up to date version is always available in [1].
>
> There is a nice guide for upstream developers who tell you things like
> this.  You should rather point upstream to
>
>    https://wiki.debian.org/UpstreamGuide#Releases_and_Versions
>
> than trying to fix things on behalf of upstream.


Thanks for the link. Having a look at it as well.


>   BTW, I wonder what
> amount of speed gain KMC authors are expecting from a library written in
> assembly?  These days compilers are really optimised.  If I see source
> files like memcmp64.asm, memcpy64.asm etc I *really* wonder what I
> should expect from an author who fails to comply to some basic rules
> like releasing versioned tarballs.  If *I* would try to develop a piece
> of software I would not rely on such code, sorry.
>
>
That was my initial thought too.
Here's what they had to say about it:

Jorge said:

> I am not completely sure, having read the readme.txt file provided with
> the source distribution, whether the source code also includes these
> libraries:
> * asmlib - for fast memcpy operation (
> http://www.agner.org/optimize/asmlib-instructions.pdf)
> * libbzip2 - for support for bzip2-compressed input FASTQ/FASTA files (
> http://www.bzip.org/)
> * zlib - for support for gzip-compressed input FASTQ/FASTA files (
> http://www.zlib.net/)


KMC said:

> These libraries are provided with KMC source codes in a binary form. You
> can find them here:
> https://github.com/js21/kmc/tree/master/kmer_counter/libs
> The ones with "lib" extension are for windows compilation (Visual Studio
> compiler) and the ones with "a" extension are for linux compilation (g++
> compiler).

Jorge said:

> I know that zlib is available as a Debian package and it will be easy
> to simply state it as a dependency for KMC.
> Running apt-cache search for libbzip2 I can see the following:
> js21@builder2:~/deb_alioth/current/kmc_packaging/temp/kmc$ apt-cache
> search libbzip2
> lib32bz2-1.0 - high-quality block-sorting file compressor library - 32bit
> runtime
> libbz2-1.0 - high-quality block-sorting file compressor library - runtime
>
> I'm not sure whether this is the library you generally use, but I'm sure
> it would do the same job. We can again state this as a dependency of the
> KMC package and it will be installed in the user's system at KMC build time.
> The library that raises a flag for me is the asmlib. I can find a libasm
> through apt-cache, but this seems to be referenced as a library for Java.
> (crazy java)
> If this library is essential for KMC, then we will need to ask the Debian
> Med list if this is already available through Debian. If not, we will
> probably have to create a package for it as well.


KMC said:

> Asmlib is essential in the performance sense. In KMC there is a lot of work
> with buffers; sometimes these buffers or parts of them must be copied, and
> asmlib provides quicker copying than the standard implementation.
>
> I am not really sure how Debian packages work, but if it is possible to
> provide binary libraries inside the package, KMC should work without any
> dependencies.

END_OF_KMC_JORGE_TALK



> > > I do not see any point for not using straight upstream
> > > repository.
> >
> > I am hoping that the KMC authors will convince the asmlib author to
> > distribute his code through a VCS.
>
> In the sense what I wrote above I personally would be happy if KMC
> authors would use plain C/C++ code and do some serious testing what
> speed they really gain and how maintainable their code would be.
>

I will see if I can get some evidence from the KMC authors as to why they
chose this implementation, rather than native.


>
> > >   It seems Git addicts have a lot of fun by cloning things.
> >
> > I think you're assuming too much here.
>
> Yes, you are right.  I should have checked in advance.  Sorry.
>

It's OK. Thanks for saying that.


>
> > In my view, if the upstream author sees the simplicity of what I have
> > done, I believe it would be easier to convince him/her to use whatever
> > VCS. In my case Git, because I am now used to it.
>
> OK, that's a valid point.
>

+1


>
> > >     git import-orig --pristine-tar
> > >
> > > since the repository has no upstream branch.  Please try to follow our
> > > team policy as closely as possible.  Otherwise your coworkers will have
> > > trouble to
> >
> > I have used it; I think I just forgot to push it. It's now pushed for
> > asmlib.
>
> OK, confirmed that I was able to pull this.
>

Sweet!


>
> > > > genomes, not as in machine code) IVA written in python by a member
> > > > of my team over at the Wellcome Trust Sanger Institute.
> > > >
> > > > [7] Original upstream - https://github.com/sanger-pathogens/iva
> > >
> > > Sounds good!
> >
> > At least something...
>
> Well, it seems my murmuring was a bit depressing to you.  This was not
> intended.  Maybe the unusual way of source distribution has backfired
> on you, since I did not expect this kind of trouble and was not
> checking properly.
>

Don't worry about it. It's all a dynamic process and it's not that it was
depressing. You had no clear picture of what was in my head.
I also thought this task would be much simpler than what it's turning out
to be.
I would like to see it through though.


>
> > I'll work through quilt patches.
> > I only removed code to see if I could build the package in my machine.
> > And since the nmake makefile is completely useless with make + the only
> > libraries useful for Debian would be the 32bit and 64bit Unix libraries
> > + I'm not making any changes to the original source file, simply adding
> > a Unix Makefile, I thought this would be a simple solution.
>
> Here we are facing another drawback of asmlib usage:  KMC authors are
> excluding promising architectures like arm64 and ppc64el which might
> become relevant for tasks in bioinformatics in the not so distant
> future.
>

Very valid point. I will raise it in my next message to them.
I can cc you if you want. If you want to talk to them directly, that could
also help.
The two guys I've been talking to are:

KMC developer - [email protected]
KMC coordinator (I think) - [email protected]


>
> > > Yes, for sure.  I think the decision to package asmlib separately was
> > > drawn in the beginning of this discussion.
> > >
> >
> > This email was sent to both Debian Med and Debian Mentors. I thought it
> > best to be as verbose as I could be.
>
> I missed this and I'm now CCing debian-mentors as well.
>
>
Cool!


> Summary: I would try to discuss with KMC developers whether they would
> see any chance to make asmlib optional and could provide the full
> functionality without this library.  Writing assembly language is to the
> best of my knowledge something you did in the 1990s.
> Trying to reimplement things like memcmp, memcpy etc is something you
> should avoid IMHO.
>
>
100% agreed. As mentioned above, I raised the subject with them, and they
gave arguments for using these reimplementations.
I'll send them an email later today. They are busy writing the next version
of KMC and have become slightly unresponsive of late.
Fingers crossed.

Kind regards,

Jorge
