[Denovoassembler-users] Release of Ray v2.1.0 (mostly bug fixes)

Sébastien Boisvert Tue, 30 Oct 2012 16:04:05 -0700

Hello,

Ray v2.0.0 was released on 2012-06-22. It is time to release Ray v2.1.0!


It is available directly at

     http://sourceforge.net/projects/denovoassembler/files/Ray-v2.1.0.tar.bz2

Documentation was added for the metagenomics solutions called 'Ray Méta',
'Ray Communities', and 'Ray Ontologies' that are implemented in Ray plugins.

Changes in bioinformatics algorithm implementations:

Changes include a new data reliability option, options to control the maximum
(or minimum) accepted k-mer coverage, a fix for a race condition in the plugin
that colors the graph, new options for the storage engine, faster network
tests, fixes for input files compressed with bzip2, the ability to disable
scaffolding, various portability fixes, patches for twin k-mers (efficient
storage), and faster building of the distributed graph.
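
The full list below notes that a bz2 file can contain many compressed streams,
each of which has to be read until its end marker. As a rough illustration of
that idea (this is not Ray's SequencesLoader code, which is C++; the function
name is hypothetical), here is a minimal Python sketch using the standard bz2
module:

```python
import bz2


def decompress_all_streams(data):
    """Decompress every bz2 stream concatenated in `data` (bytes)."""
    chunks = []
    while data:
        # One decompressor handles exactly one stream.
        decompressor = bz2.BZ2Decompressor()
        chunks.append(decompressor.decompress(data))
        if not decompressor.eof:
            # The stream ended before its end-of-stream marker.
            raise ValueError("truncated bz2 stream")
        # Bytes after the end of this stream belong to the next one.
        data = decompressor.unused_data
    return b"".join(chunks)


# Two independently compressed streams stored back to back.
two_streams = bz2.compress(b"ACGT\n") + bz2.compress(b"TTGA\n")
print(decompress_all_streams(two_streams))
```

A reader that stops after the first end-of-stream marker would silently drop
the second stream in this example.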

Changes in the runtime engine:

The distributed storage backend was optimized, hardware acceleration with
popcount is used when available, there is a new registration system for
plugins, bugs were fixed in the hash table, the default communication model is
now MPI_Iprobe / MPI_ANY_SOURCE, and there are new routines for dirty buffer
management as well as a polytope communication graph.
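
To give an idea of what routing on such a graph looks like, here is a minimal
Python sketch of textbook dimension-order routing on a hypercube, the
two-symbol special case of the polytope. It assumes the number of ranks is a
power of two and that ranks are numbered from 0. The function name is
hypothetical and the routing policy actually used by RayPlatform may differ.

```python
def hypercube_route(source, destination):
    """Return the ranks visited from source to destination.

    Two ranks are neighbours when their numbers differ in exactly one bit,
    so each hop flips one differing bit (lowest bit first).
    """
    path = [source]
    current = source
    difference = current ^ destination
    while difference:
        lowest_bit = difference & -difference  # isolate the lowest set bit
        current ^= lowest_bit                  # move to that neighbour
        path.append(current)
        difference = current ^ destination
    return path


print(hypercube_route(0, 5))
```

The number of hops equals the number of bits that differ between the two rank
numbers, so a message needs at most log2(number of ranks) hops while each rank
only talks directly to log2(number of ranks) neighbours.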
  
Full list:

---
Changes between Ray v2.0.0 and Ray v2.1.0:

  100 files changed, 4294 insertions(+), 2398 deletions(-)

Pier-Luc Plante (3):
       Scaffolder is not required when using unpaired reads.
       Patch Koala: Added an option (-use-maximum-seed-coverage) so that
         highly-covered seeds can be ignored.
       Corrected the test that determines the quality control results.
         There were too many false negatives. The returned value is more
         reliable now.

Sébastien Boisvert (142):
       The copyright was updated to add 2012.
       When there are 508 reads and 32 MPI ranks, the number of reads per
         rank is 508/32 = 15. Therefore, assuming a perfect division, read
         number 495 would be on MPI rank 33 (495/15 = 33). This makes Ray
         crash. This change set corrects this.
       A list of releases was added.
       The codename of the next release will be "Ancient Granularity of Epochs".
       An assertion was added for the performance scaled messaging related bug.
       Two assertions were added to detect possible message corruption.
       The help page was updated to add the data reliability option.
         Signed-off-by: Sébastien Boisvert <[email protected]>
       The peak finder was modified to pass new tests.
       I edited the guide to submit changes.
       The manual now includes the new option for overly-covered seeds.
       An error was fixed in the file that says how to submit changes.
       The return statement was misplaced in a recent patch.
       I added the names 'Ray Méta', 'Ray Communities', and 'Ray Ontologies'.
       An assertion was added to make sure that data is not overwritten.
       Searcher: added verbose statements
       Searcher: fixed a race condition
       Searcher: added a missing value.
       SeedExtender: moved system calls inside this plugin
       SeedExtender: modified the code for hot skipping
       SeedExtender: implemented hot skipping
       Parameters: 4 options were added to change distributed storage behavior.
       Documentation: Ray can be run with a single configuration file
         containing options.
       The default load factor threshold was changed to 0.75.
       The methods setKey() and getKey() were added to KmerCandidate and
         Vertex classes for compatibility with MyHashTable.
       If the hash table is verbose, ask it to display its status.
       NetworkTest: added the option -skip-network-test to skip the network test.
       Added a new option to enable genome neighbourhood calculation.
         The option is -find-neighbourhoods
       I added some code to detect 32-bit and 64-bit Windows.
       More parameters for compilation can be provided with EXTRA=...
       Porting Ray to the new RayPlatform: removed macro calls in .h files.
       Porting Ray to the new RayPlatform: removed remaining codes in .h.
       Porting Ray to the new RayPlatform: removed token
         'generated_automatically'.
       Porting Ray to the new RayPlatform: added CreatePlugin and BindPlugin
         instructions.
       Porting Ray to the new RayPlatform: updated the macro names in C++
         plugin files.
       Porting Ray to the new RayPlatform: removed adapter from plugin
         class definitions.
       Porting Ray to the new RayPlatform: remove calls to setObject.
       Porting Ray to the new RayPlatform: Ray compiles with the simplified
         RayPlatform adapters now.
       I removed handlers from the cmake file.
       Updating the manual.
       SeedExtender: changed the verbosity period.
       Removed some output from the computation of seeds.
       The manual was updated to include pointers to documentation.
       If you run Ray with a configuration file (mpiexec -n 4 Ray Ray.conf)
         you can start comments with the '#' symbol like in Python.
       Information to compile Ray with gcc was added.
       The default number of buckets is now 1048576. The default number of
         buckets per group is still 64, so that is only 16384 groups with
         almost no memory usage because it is sparse.
       This fixes an input/output bug for the Ray configuration file.
       The code that randomizes the arguments was removed because it can
         lead to bugs. This also simplifies checkpointing.
       The edge purging should be done in a massively parallel way unless
         the option -write-kmers was provided.
       Merge branch 'master' of https://github.com/plpla/ray into pl
       I added a script to build Ray with link time optimization.
       The EXTRA commands are also given to the linking command.
       I added -fwhole-program for better optimization.
       I added compilation flags for compression.
       I added instructions to build Ray with link time optimization.
       NetworkTest: the number of test messages is now constant regardless
         of the number of MPI ranks in the communicator.
       application_core: added a call to obtain a string configuration token.
       KmerAcademyBuilder: option -bloom-filter-bits can set the number of bits.
       KmerAcademyBuilder: Bloom filter has 64 M bits by default.
       Merge branch 'master' of github.com:sebhtml/ray
       Merge branch 'master' of github.com:sebhtml/ray
       SequencesLoader: added a 'please wait' before counting entries in a file.
       SequencesLoader: a bz2 file can contain many compressed streams.
         Each of them needs to be opened, read (until BZ_STREAM_END), and
         closed.
       application_core: bugs were fixed in the configuration routines.
       GeneOntology: removed the use of argv
       Merge branch 'master' of github.com:sebhtml/ray
       Merge branch 'master' of github.com:sebhtml/ray
       Fixed an integer overflow in the distributed storage engine.
       A path with 0 k-mers has 0 nucleotides, not 0-k+1.
       Merge branch 'master' of github.com:sebhtml/ray
       A new routing graph is available: the hypercube.
       Documentation: documented the hypercube features of Ray.
       core: the default number of buckets is now 268435456 per rank.
       scaffolder: it can be disabled with -disable-scaffolder
       normalized option names with -enable-* and -disable-*
       documentation: moved assembly options up
       core: added documentation for class Parameters.
       SeedingData: -use-minimum-seed-coverage changes the minimum
       documentation: added missing operands in the manual and -help page
       core: Ray -version provides more compile flags like popcnt and sse
       SeedingData: seeds can not contain k-mers with too low coverage
       build: the C++ standard is C++ 1998. gcc -ansi provides that
       Searcher: large integer constants need ULL for portability
       SeedExtender: added additional information for an error
       MessageProcessor: k-mer data messages should never be discarded
       VerticesExtractor: don't flush while waiting for messages
       KmerAcademyBuilder: only send the forward k-mer, not the lower
       VerticesExtractor: improved the code quality for easier reading
       MessageProcessor: don't discard k-mers while receiving messages
       VerticesExtractor: store twin edges in a single source
       EdgePurger: any edge is removed only if an end is not in the graph
       MessageProcessor: removed a call to a private attribute
       Documentation: added a document about profiling Ray
       Documentation: added information about elapsed time
       BuildSystem: added a strip command to reduce the memory footprint
       BuildSystem: replaced -ansi with -std=c++98 for more verbosity
       Documentation: updated the author file
       KmerAcademyBuilder: removed the k-mer academy
       VerticesExtractor: this module extracts vertices to add edges
       Merge branch 'kill-kmer-academy'
       MessageProcessor: new text to show when the Bloom filter is created
       KmerAcademyBuilder: added the number of set bits in the Bloom filter
       MessageProcessor: added a warning when the oracle is half full
       KmerAcademyBuilder: the Bloom filter can have any number of bits
       Merge branch 'bloom-features'
       MessageProcessor: coverage depth starts at 1 with Bloom filters
       MessageProcessor: the threshold is 50.0 (50.0%), not 0.5
       KmerAcademyBuilder: added the number of filtered k-mers
       Merge branch 'bug-hunting'
       application_core: added routing with a convex regular polytope
       NetworkTest: the number of exchanges can be changed with -exchanges
       Documentation: added options for a 64-rank polytope
       Documentation: updated the taxonomy documentation
       NetworkTest: added average round trip latency
       scripts: initial version of a script to create NCBI taxonomy
       scripts: download NCBI bacterial genomes too
       Merge branch 'master' of github.com:sebhtml/ray
       Documentation: added documentation for NCBI taxonomy
       Documentation: simplified the usage of the tool to pull NCBI data
         Signed-off-by: Sébastien Boisvert <[email protected]>
       scripts: the script that pulls NCBI data is almost ready
       scripts: the script that pulls NCBI stuff is ready
       Documentation: added information about XML files
       Partitioner: also create a file FilePartition.txt
       MachineHelper: don't run the AMOS code path if not necessary
       Parameters: throw a warning when distances are invalid
       Merge branch 'for-seb-September-2012'
       Searcher: fixed a race condition where a message was lost
       Calls to deprecated methods were eliminated.
       This is Ray v2.1.0-rc0 "Ancient Granularity of Epochs"
       Searcher: browsing the distributed colored de Bruijn subgraph
       Searcher: find or create a virtual color from physical colors
       Searcher: added physical color in SequenceAbundances.xml
       Searcher: fixed assertion code
       scripts: don't ship the example and only ship the bz2 distribution
       SequencesLoader: fixed the scope of a buffer
       Searcher: removed debug messages from stable release
       Documentation: added more documentation for gene ontology.
       Searcher: fixed buffer overflow
       Searcher: fixed compilation warnings
       Searcher: GraphBrowsing.xml needs -one-color-per-file
       This is the branch for Ray v2.1.0-rc1
       Related git repositories were added in the README.
       Ray v2.1.0
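
The rank computation described in the 508-read entry of the list above can be
reproduced in a few lines of Python. This is only an illustrative sketch: the
function names are hypothetical, and clamping to the last rank is one possible
correction, not necessarily the one that was applied in Ray. The only fact
relied upon is that 32 MPI ranks are numbered 0 to 31.

```python
def naive_owner(read_index, total_reads, ranks):
    """Owner rank assuming a perfect division (the behaviour that crashed)."""
    reads_per_rank = total_reads // ranks
    return read_index // reads_per_rank


def clamped_owner(read_index, total_reads, ranks):
    """Same computation, but leftover reads go to the last rank.

    Assumption: this clamp is a hypothetical fix used for illustration.
    """
    reads_per_rank = max(1, total_reads // ranks)
    return min(read_index // reads_per_rank, ranks - 1)


total_reads, ranks = 508, 32
print(total_reads // ranks)                    # reads per rank
print(naive_owner(495, total_reads, ranks))    # outside 0..31
print(clamped_owner(495, total_reads, ranks))  # always inside 0..31
```

With 508 reads and 32 ranks the integer division leaves 28 reads that do not
fit into 32 blocks of 15, which is why the unclamped result can exceed the
last valid rank.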

---
Changes between RayPlatform v1.0.3 and RayPlatform v1.1.0:

  52 files changed, 3215 insertions(+), 1244 deletions(-)

Sébastien Boisvert (58):
       A release list was added.
       Message checksums are calculated by default for any non-empty message
         by RayPlatform.
       The option -verify-message-integrity must be provided to enable
         message integrity verification in RayPlatform.
         By default, the checksum is calculated by the software.
       An integer comparison was fixed.
       I implemented a system of annotation for buffers. With this,
         RayPlatform knows which buffer is dirty (possibly available,
         but maybe not) and which buffer is available.
       I fixed a typographical error in the documentation.
       I added a comment for dirty buffers. Because MPI_Request objects are
         usually "completed" before the message is actually on the destination,
         I don't think the RayPlatform virtual machine is going to run out of
         non-dirty buffers.
       The latency on an IBM iDataPlex (guillimin at McGill) for a Ray job of
         36 cores was reduced from 23 to 17 microseconds (back and forth).
       I cleaned the persistent communication code.
       Merge branch 'master' of github.com:sebhtml/RayPlatform
       The three communication models were documented in the source code.
         The three models are:
       The constructor of the hash table now takes the number of buckets, the
         number of buckets per group, and load factor threshold as well as the
         verbosity.
       structures: increased portability of the hash table code.
       The class for hash table groups was moved to its own file.
       This fixes a bug introduced while working on the portability.
       The table prints its status after completion of the resizing,
         when in verbose mode.
       I added David Weese of Free University of Berlin in the code as
         he reviewed the hash table code.
       structure: using compiler builtins for some processing in the hash table.
       The specific code was moved inside one portable method.
       I added some comments in the ring allocator.
       Status is not printed if verbosity is not enabled.
       The registration system for plugins was changed. Now it uses function
         pointers instead of virtual methods, which can be slow as they can
         not be inlined.
       I added MessageWarden in the README.
       I added some documentation for handlers.
       Some more documentation was added.
       This fixes a bug in the insert() operation of the hash table during
         incremental resizing.
       h1 must return something between 0 and M-1 whereas h2 must return
         something odd between 1 and M-1. This was fixed in the code.
       The hash table also prints memory allocation information when
         printing its status.
       communication: switched the model to MPI_ANY_SOURCE.
       Added routines to clean dirty buffers when they are all dirty.
       A new routing graph is available: it is the hypercube.
       The hypercube prints its status before the end.
       routing: added status code for hypercube.
       communication: improved the last step in routing.
       routing: started to implement a round-robin policy for hypercube routing.
       routing: the round-robin hypercube is available in the code.
       routing: the hypercube can be modified to be a pseudo-hypercube
       communication: increased the number of buffers for messaging
       communication: removed a useless line in the code
       Updated the code name for the upcoming release.
       communication: registration of dirty buffers is more efficient.
       communication: errors related to dirty buffers are more verbose
       cryptography: now using __SSE4_2__ provided by gcc -march=native
       Documentation: updated the author file
       structures/MyHashTable: added missing headers
       communication: show a warning when at least 64 buffers are dirty
       routing: added routing with a convex regular polytope
       MessageRouter: store the routing information in the buffer
       routing: don't write routes for the polytope surface (called hypercube)
       core: fixed a buffer allocation bug in the core
       communication: the real-time sweeper is better configured
       the upper bound for the number of sent messages is not m_size
       This is RayPlatform (the engine) v1.1.0-rc0 "Chariot of Complexity"
       ComputeCore: routed messages must be purged
       communication: introducing the CONFIG_COMM_IRECV_TESTANY model
       communication: non-blocking communication is bad on Blue Gene/Q
       This is the branch development version for RayPlatform v1.1.0-rc1
       RayPlatform v1.1.0
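
The constraint on h1 and h2 in the list above matches the usual requirement
for double hashing: when the number of buckets M is a power of two, any odd
step is coprime with M, so the probe sequence visits every bucket exactly
once. The Python sketch below illustrates this under the assumption that M is
a power of two. The way h1 and h2 are derived from the key is hypothetical and
is not the RayPlatform MyHashTable code.

```python
def probe_sequence(key_hash, buckets):
    """Return the order in which double hashing visits the buckets.

    Assumption: `buckets` (M) is a power of two and at least 2.
    """
    assert buckets >= 2 and buckets & (buckets - 1) == 0
    h1 = key_hash % buckets                     # between 0 and M-1
    h2 = ((key_hash // buckets) % buckets) | 1  # odd, between 1 and M-1
    return [(h1 + i * h2) % buckets for i in range(buckets)]


sequence = probe_sequence(123456789, 16)
print(sequence)
print(sorted(sequence) == list(range(16)))
```

If h2 were allowed to be even, the probe sequence could cycle over only a
fraction of the buckets, and an insertion could fail even though free buckets
remain.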

