Hello,

Ray v2.0.0 was released on 2012-06-22. It is time to release Ray v2.1.0!
It is available directly at http://sourceforge.net/projects/denovoassembler/files/Ray-v2.1.0.tar.bz2

Documentation was added for the metagenomics solutions called 'Ray Méta', 'Ray Communities', and 'Ray Ontologies', which are implemented in Ray plugins.

Changes in bioinformatics algorithm implementations:

Changes include a new data reliability option, options to control the maximum (or minimum) accepted k-mer coverage, a fix for a race condition in the plugin that colors the graph, new options for the storage engine, faster network tests, fixes for input files compressed with bzip2, the ability to disable scaffolding, various portability fixes, patches for twin k-mers (efficient storage), and faster building of the distributed graph.

Changes in the runtime engine:

The distributed storage backend was optimized. Other changes include hardware acceleration with pop count when available, a new registration system for plugins, bug fixes in the hash table, a default communication model that is now MPI_Iprobe / MPI_ANY_SOURCE, new routines for dirty buffer management, and a polytope communication graph.

Full list:

---

Changes between Ray v2.0.0 and Ray v2.1.0:

100 files changed, 4294 insertions(+), 2398 deletions(-)

Pier-Luc Plante (3):
      Scaffolder is not required when using unpaired reads.
      Patch Koala: Added an option (-use-maximum-seed-coverage) so that highly-covered seeds can be ignored.
      Corrected the test that determines the quality control results. There were too many false negatives. The returned value is more reliable now.

Sébastien Boisvert (142):
      The copyright was updated to add 2012.
      When there are 508 reads and 32 MPI ranks, the number of reads per rank is 508/32 = 15. Therefore, assuming a perfect division, read number 495 would be on MPI rank 33 (495/15 = 33). This makes Ray crash. This change set corrects this.
      A list of releases was added.
      The codename of the next release will be "Ancient Granularity of Epochs".
      An assertion was added for the performance scaled messaging related bug.
      Two assertions were added to detect possible message corruption.
      The help page was updated to add the data reliability option. Signed-off-by: Sébastien Boisvert <sebastien.boisver...@ulaval.ca>
      The peak finder was modified to pass new tests.
      I edited the guide to submit changes.
      The manual now includes the new option for overly-covered seeds.
      An error was fixed in the file that says how to submit changes.
      The return statement was misplaced in a recent patch.
      I added the names 'Ray Méta', 'Ray Communities', and 'Ray Ontologies'.
      An assertion was added to make sure that data is not overwritten.
      Searcher: added verbose statements
      Searcher: fixed a race condition
      Searcher: added a missing value.
      SeedExtender: moved system calls inside this plugin
      SeedExtender: modified the code for hot skipping
      SeedExtender: implemented hot skipping
      Parameters: 4 options were added to change distributed storage behavior.
      Documentation: Ray can be run with a single configuration file containing options.
      The default load factor threshold was changed to 0.75.
      The methods setKey() and getKey() were added to KmerCandidate and Vertex classes for compatibility with MyHashTable.
      If the hash table is verbose, ask it to display its status.
      NetworkTest: added the option -skip-network-test to skip the network test.
      Added a new option to enable genome neighbourhood calculation. The option is -find-neighbourhoods
      I added some code to detect Windows 32 bits and Windows 64 bits.
      More parameters for compilation can be provided with EXTRA=...
      Porting Ray to the new RayPlatform: removed macro calls in .h files.
      Porting Ray to the new RayPlatform: removed remaining codes in .h.
      Porting Ray to the new RayPlatform: removed token 'generated_automatically'.
      Porting Ray to the new RayPlatform: added CreatePlugin and BindPlugin instructions.
      Porting Ray to the new RayPlatform: updated the macro names in C++ plugin files.
      Porting Ray to the new RayPlatform: removed adapter from plugin class definitions.
      Porting Ray to the new RayPlatform: remove calls to setObject.
      Porting Ray to the new RayPlatform: Ray compiles with the simplified RayPlatform adapters now.
      I removed handlers from the cmake file.
      Updating the manual.
      SeedExtender: changed the verbosity period.
      Removed some output from the computation of seeds.
      The manual was updated to include pointers to documentation.
      If you run Ray with a configuration file (mpiexec -n 4 Ray Ray.conf) you can start comments with the '#' symbol like in Python.
      Information to compile Ray with gcc was added.
      The default number of buckets is now 1048576. The default number of buckets per group is still 64, so that is only 16384 groups with almost no memory usage because it is sparse.
      This fixes an input/output bug for the Ray configuration file.
      The code that randomizes the arguments was removed because it can lead to bugs. This also simplifies checkpointing.
      The edge purging should be done in a massively parallel way unless the option -write-kmers was provided.
      Merge branch 'master' of https://github.com/plpla/ray into pl
      I added a script to build Ray with link time optimization.
      The EXTRA commands are also given to the linking command.
      I added -fwhole-program for better optimization.
      I added compilation flags for compression.
      I added instructions to build Ray with link time optimization.
      NetworkTest: the number of test messages is now constant regardless of the number of MPI ranks in the communicator.
      application_core: added a call to obtain a string configuration token.
      KmerAcademyBuilder: option -bloom-filter-bits can set the number of bits.
      KmerAcademyBuilder: Bloom filter has 64 M bits by default.
      Merge branch 'master' of github.com:sebhtml/ray
      Merge branch 'master' of github.com:sebhtml/ray
      SequencesLoader: added a 'please wait' before counting entries in a file.
      SequencesLoader: a bz2 file can contain many compressed streams. Each of them needs to be opened, read (until BZ_STREAM_END), and closed.
      application_core: bugs were fixed in the configuration routines.
      GeneOntology: removed the use of argv
      Merge branch 'master' of github.com:sebhtml/ray
      Merge branch 'master' of github.com:sebhtml/ray
      Fixed an integer overflow in the distributed storage engine.
      A path with 0 k-mers has 0 nucleotides, not 0-k+1.
      Merge branch 'master' of github.com:sebhtml/ray
      A new routing graph is available: the hypercube.
      Documentation: documented the hypercube features of Ray.
      core: the default number of buckets is now 268435456 per rank.
      scaffolder: it can be disabled with -disable-scaffolder
      normalized option names with -enable-* and -disable-*
      documentation: moved assembly options up
      core: added documentation for class Parameters.
      SeedingData: -use-minimum-seed-coverage changes the minimum
      documentation: added missing operands in the manual and -help page
      core: Ray -version provides more compile flags like popcnt and sse
      SeedingData: seeds can not contain k-mers with too low coverage
      build: the C++ standard is C++ 1998. gcc -ansi provides that
      Searcher: large integer constants need ULL for portability
      SeedExtender: added additional information for an error
      MessageProcessor: k-mer data messages should never be discarded
      VerticesExtractor: don't flush while waiting for messages
      KmerAcademyBuilder: only send the forward k-mer, not the lower
      VerticesExtractor: improved the code quality for easier reading
      MessageProcessor: don't discard k-mers while receiving messages
      VerticesExtractor: store twin edges in a single source
      EdgePurger: any edge is removed only if an end is not in the graph
      MessageProcessor: removed a call to a private attribute
      Documentation: added a document about profiling Ray
      Documentation: added information about elapsed time
      BuildSystem: added a strip command to reduce the memory footprint
      BuildSystem: replaced -ansi with -std=c++98 for more verbosity
      Documentation: updated the author file
      KmerAcademyBuilder: removed the k-mer academy
      VerticesExtractor: this module extracts vertices to add edges
      Merge branch 'kill-kmer-academy'
      MessageProcessor: new text to show when the Bloom filter is created
      KmerAcademyBuilder: added the number of set bits in the Bloom filter
      MessageProcessor: added a warning when the oracle is half full
      KmerAcademyBuilder: the Bloom filter can have any number of bits
      Merge branch 'bloom-features'
      MessageProcessor: coverage depth starts at 1 with Bloom filters
      MessageProcessor: the threshold is 50.0 (50.0%), not 0.5
      KmerAcademyBuilder: added the number of filtered k-mers
      Merge branch 'bug-hunting'
      application_core: added routing with a convex regular polytope
      NetworkTest: the number of exchanges can be changed with -exchanges
      Documentation: added options for a 64-rank polytope
      Documentation: updated the taxonomy documentation
      NetworkTest: added average round trip latency
      scripts: initial version of a script to create NCBI taxonomy
      scripts: download NCBI bacterial genomes too
      Merge branch 'master' of github.com:sebhtml/ray
      Documentation: added documentation for NCBI taxonomy
      Documentation: simplified the usage of the tool to pull NCBI data Signed-off-by: Sébastien Boisvert <sebastien.boisver...@ulaval.ca>
      scripts: the script that pulls NCBI data is almost ready
      scripts: the script that pulls NCBI stuff is ready
      Documentation: added information about XML files
      Partitioner: also create a file FilePartition.txt
      MachineHelper: don't run the AMOS code path if not necessary
      Parameters: throw a warning when distances are invalid
      Merge branch 'for-seb-September-2012'
      Searcher: fixed a race condition where a message was lost
      Calls to deprecated methods were eliminated.
      This is Ray v2.1.0-rc0 "Ancient Granularity of Epochs"
      Searcher: browsing the distributed colored de Bruijn subgraph
      Searcher: find or create a virtual color from physical colors
      Searcher: added physical color in SequenceAbundances.xml
      Searcher: fixed assertion code
      scripts: don't ship the example and only ship the bz2 distribution
      SequencesLoader: fixed the scope of a buffer
      Searcher: removed debug messages from stable release
      Documentation: added more documentation for gene ontology.
      Searcher: fixed buffer overflow
      Searcher: fixed compilation warnings
      Searcher: GraphBrowsing.xml needs -one-color-per-file
      This is the branch for Ray v2.1.0-rc1
      Related git repositories were added in the README.
      Ray v2.1.0

---

Changes between RayPlatform v1.0.3 and RayPlatform v1.1.0:

52 files changed, 3215 insertions(+), 1244 deletions(-)

Sébastien Boisvert (58):
      A release list was added.
      Message checksums are calculated by default for any non-empty message by RayPlatform.
      The option -verify-message-integrity must be provided to enable message integrity verification in RayPlatform. By default, the checksum is calculated by the software.
      An integer comparison was fixed.
      I implemented a system of annotation for buffers. With this, RayPlatform knows which buffer is dirty (possibly available, but maybe not) and which buffer is available.
      I fixed a typographical error in the documentation.
      I added a comment for dirty buffers. Because MPI_Request objects are usually "completed" before the message is actually on the destination, I don't think the RayPlatform virtual machine is going to run out of non-dirty buffers.
      The latency on an IBM iDataPlex (guillimin at McGill) for a Ray job of 36 cores was reduced from 23 to 17 microseconds (back and forth).
      I cleaned the persistent communication code.
      Merge branch 'master' of github.com:sebhtml/RayPlatform
      The three communication models were documented in the source code. The three models are:
      The constructor of the hash table now takes the number of buckets, the number of buckets per group, and load factor threshold as well as the verbosity.
      structures: increased portability of the hash table code.
      The class for hash table groups was moved to its own file.
      This fixes a bug introduced while working on the portability.
      The table prints its status after completion of the resizing, when in verbose mode.
      I added David Weese of Free University of Berlin in the code as he reviewed the hash table code.
      structure: using compiler builtins for some processing in the hash table.
      The specific code was moved inside one portable method.
      I added some comments in the ring allocator.
      Status is not printed if verbosity is not enabled.
      The registration system for plugins was changed. Now it uses function pointers instead of virtual methods, which can be slow as they can not be inlined.
      I added MessageWarden in the README.
      I added some documentation for handlers.
      Some more documentation was added.
      This fixes a bug in the insert() operation of the hash table during incremental resizing.
      h1 must return something between 0 and M-1 whereas h2 must return something odd between 1 and M-1. This was fixed in the code.
      The hash table also prints memory allocation information when printing its status.
      communication: switched the model to MPI_ANY_SOURCE.
      Added routines to clean dirty buffers when they are all dirty.
      A new routing graph is available: it is the hypercube.
      The hypercube prints its status before the end.
      routing: added status code for hypercube.
      communication: improved the last step in routing.
      routing: started to implement a round-robin policy for hypercube routing.
      routing: the round-robin hypercube is available in the code.
      routing: the hypercube can be modified to be a pseudo-hypercube
      communication: increased the number of buffers for messaging
      communication: removed a useless line in the code
      Updated the code name for the upcoming release.
      communication: registration of dirty buffers is more efficient.
      communication: errors related to dirty buffers are more verbose
      cryptography: now using __SSE4_2__ provided by gcc -march=native
      Documentation: updated the author file
      structures/MyHashTable: added missing headers
      communication: show a warning when at least 64 buffers are dirty
      routing: added routing with a convex regular polytope
      MessageRouter: store the routing information in the buffer
      routing: don't write routes for the polytope surface (called hypercube)
      core: fixed a buffer allocation bug in the core
      communication: the real-time sweeper is better configured
      the upper bound for the number of sent messages is not m_size
      This is RayPlatform (the engine) v1.1.0-rc0 "Chariot of Complexity"
      ComputeCore: routed messages must be purged
      communication: introducing the CONFIG_COMM_IRECV_TESTANY model
      communication: non-blocking communication is bad on Blue Gene/Q
      This is the branch development version for RayPlatform v1.1.0-rc1
      RayPlatform v1.1.0
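
The Ray changelog entry about 508 reads and 32 MPI ranks can be checked with a few lines of arithmetic. The sketch below is in Python rather than Ray's C++, and the function names naive_owner and clamped_owner are made up for illustration. Clamping to the last rank is only one possible remedy and is an assumption here; the changelog does not say how Ray corrected the problem.

```python
def naive_owner(read_index, total_reads, ranks):
    """Owner computed from the truncated per-rank quota (can exceed ranks - 1)."""
    reads_per_rank = total_reads // ranks  # 508 // 32 == 15
    return read_index // reads_per_rank


def clamped_owner(read_index, total_reads, ranks):
    """Assumed remedy: the last rank absorbs the reads left over by truncation."""
    reads_per_rank = max(1, total_reads // ranks)
    return min(read_index // reads_per_rank, ranks - 1)


if __name__ == "__main__":
    print(naive_owner(495, 508, 32))    # 33, but valid ranks are 0..31
    print(clamped_owner(495, 508, 32))  # 31
```

With 508 reads, the truncated quota covers only 32 * 15 = 480 reads, so reads 480 to 507 are the ones that map past the last rank.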
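
The SequencesLoader entry notes that a bz2 file can hold several compressed streams, each of which has to be read to its end before the next one is opened. Ray does this in C++ (the entry mentions BZ_STREAM_END); the sketch below shows the same idea with Python's standard bz2 module, where the eof attribute of a decompressor plays the role of the end-of-stream signal. The function name read_all_streams and the chunk size are illustrative choices, not part of Ray.

```python
import bz2


def read_all_streams(path, chunk_size=64 * 1024):
    """Decompress every bzip2 stream stored back to back in one file."""
    output = bytearray()
    decompressor = bz2.BZ2Decompressor()
    with open(path, "rb") as handle:
        pending = handle.read(chunk_size)
        while pending:
            output += decompressor.decompress(pending)
            if decompressor.eof:
                # One stream is finished: keep the bytes that belong to the
                # next stream and start a fresh decompressor for them.
                pending = decompressor.unused_data
                decompressor = bz2.BZ2Decompressor()
                if not pending:
                    pending = handle.read(chunk_size)
            else:
                pending = handle.read(chunk_size)
    return bytes(output)
```

A reader that stops at the first end-of-stream marker would silently return only the first stream. This sketch does not report truncated input; a production reader would check that the last stream actually reached its end.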
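
The RayPlatform entry on h1 and h2 states the usual requirement for double hashing: h1 picks a starting bucket in 0..M-1, and h2 picks an odd stride in 1..M-1. When M is a power of two (the default bucket counts quoted above, 1048576 and 268435456, both are), an odd stride shares no factor with M, so the probe sequence reaches every bucket. The sketch below illustrates this in Python. The two hash functions are placeholders invented for the example and are not the ones used by MyHashTable.

```python
def h1(key, m):
    """First hash: the starting bucket, in 0 .. m - 1."""
    return key % m


def h2(key, m):
    """Second hash: the probe stride, forced odd, in 1 .. m - 1 (needs m >= 2)."""
    return ((key // m) % m) | 1


def probe_sequence(key, m):
    """Buckets visited by double hashing for one key, in probing order."""
    start, stride = h1(key, m), h2(key, m)
    return [(start + i * stride) % m for i in range(m)]


if __name__ == "__main__":
    m = 16  # power of two, so any odd stride is coprime with m
    sequence = probe_sequence(1234567, m)
    print(sorted(sequence) == list(range(m)))  # True: every bucket is reached
```

With an even stride the sequence would cycle through only a fraction of the buckets, and an insert could fail to find a free slot even when the table has room.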
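
The hypercube routing graph mentioned in both changelogs can be pictured with the textbook binary hypercube: each rank is a vertex labelled by its rank number in binary, two ranks are connected when their labels differ in one bit, and a message is forwarded by correcting one differing bit per hop. The sketch below shows that classic scheme only. It is an assumption that this matches the general idea. RayPlatform's actual implementation (including the polytope variant and the round-robin policy) is not described in this announcement and may differ.

```python
def next_hop(current, destination):
    """Flip the lowest bit in which current and destination differ."""
    difference = current ^ destination
    if difference == 0:
        return current  # already at the destination
    lowest_bit = difference & -difference
    return current ^ lowest_bit


def route(source, destination):
    """Ranks visited from source to destination, both included."""
    path = [source]
    while path[-1] != destination:
        path.append(next_hop(path[-1], destination))
    return path


if __name__ == "__main__":
    # 64 ranks form a 6-dimensional hypercube (labels 0..63).
    print(route(5, 52))  # [5, 4, 20, 52]
```

The number of hops equals the number of bits that differ between the two labels, so with 64 ranks no message needs more than 6 hops, and each rank keeps only 6 direct connections instead of 63.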
_______________________________________________
Denovoassembler-users mailing list
Denovoassembler-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users