Hi Adrian,

No. However, the automatic detection of "outer distances" sometimes fails when
the seeds are too short.

In that case, you just need to provide that information manually.
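
As a rough illustration of providing the information manually, here is a small Python sketch that builds a Ray command line. It assumes that -p accepts an optional average outer distance and standard deviation after the two file names, as described in the Ray manual page. The file names, the distance values (3000 and 300), the number of MPI ranks and the helper name ray_paired_command are all hypothetical.

```python
def ray_paired_command(left, right, average_outer_distance,
                       standard_deviation, output_directory, ranks=8):
    """Build an mpiexec command line for Ray with one paired library.

    The outer distance and its standard deviation are given explicitly,
    so the automatic detection is not needed for this library (assumption:
    -p takes these two optional values after the file names).
    """
    return [
        "mpiexec", "-n", str(ranks), "Ray",
        "-p", left, right,
        str(average_outer_distance), str(standard_deviation),
        "-o", output_directory,
    ]


# Hypothetical mate-pair library with a 3000 nt outer distance.
command = ray_paired_command("mp_1.fastq", "mp_2.fastq", 3000, 300,
                             "bird-assembly")
print(" ".join(command))
```

The printed line can be pasted into a job script; adjust the values to your own libraries.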


On Assemblathon-2/Bird:

Contigs >= 100 nt
 Number: 88826
 Total length: 1169161521
 Average: 13162
 N50: 41098
 Median: 3368
 Largest: 465622
Contigs >= 500 nt
 Number: 68550
 Total length: 1164709611
 Average: 16990
 N50: 41306
 Median: 6862
 Largest: 465622
Scaffolds >= 100 nt
 Number: 47279
 Total length: 1270995781
 Average: 26882
 N50: 567125
 Median: 725
 Largest: 3236250
Scaffolds >= 500 nt
 Number: 27408
 Total length: 1266700501
 Average: 46216
 N50: 571612
 Median: 2137
 Largest: 3236250
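
For readers who want to recompute this kind of summary from a list of sequence lengths, here is a minimal Python sketch. It uses the usual definition of N50 (the length at which the cumulative sum of lengths, sorted from largest to smallest, reaches half of the total). The averages above are consistent with integer division (for example 1164709611 // 68550 = 16990), so the sketch truncates too. How Ray handles the median of an even number of sequences is an assumption here, and the function name summarize is hypothetical.

```python
from statistics import median


def summarize(lengths, minimum_length=100):
    """Summarize sequence lengths that are >= minimum_length."""
    kept = sorted((n for n in lengths if n >= minimum_length), reverse=True)
    if not kept:
        raise ValueError("no sequence reaches the minimum length")

    total = sum(kept)

    # N50: walk from the largest sequence down until half the total is covered.
    running = 0
    n50 = 0
    for length in kept:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "Number": len(kept),
        "Total length": total,
        "Average": total // len(kept),   # truncated, like the report above
        "N50": n50,
        "Median": int(median(kept)),     # assumption for even counts
        "Largest": kept[0],
    }


# Toy example; the 50 nt sequence is filtered out by the 100 nt threshold.
summary = summarize([100, 200, 300, 400, 500, 50])
for name, value in summary.items():
    print(f" {name}: {value}")
```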


                              Sébastien

             http://github.com/sebhtml/ray

> ________________________________________
> From: Adrian Platts [[email protected]]
> Sent: October 3, 2011 11:36
> To: Sébastien Boisvert
> Subject: Re: [Denovoassembler-users] Ray v1.7
>
> Hi Sebastien,
>
> Looking forward to trying 1.7.  One question - when inputting Illumina MP 
> reads do you have any special command line option to tell the assembler these
> are mates and their orientation?  Or do we rev-comp the ends ourselves to 
> make them look like Illumina PE orientations?
>
> Adrian
>
>
> On 2011-10-03, at 11:19 AM, Sébastien Boisvert wrote:
>
>> Dear assemblers,
>>
>>
>> Ray v1.7 is now available.
>>
>>
>> Summary of what changed:
>>
>> * MANUAL_PAGE.txt replaces the PDF manual.
>> * Output files are written to the directory specified by -o (previously it 
>> was a file prefix)
>> * Round-robin reception of messages
>> * Bloom filter
>> * Illumina mate-pairs support
>> * Job checkpointing
>> * New scaffolding algorithm
>> * New assembly engine for the extension of seeds with mate-pairs (NovaEngine)
>> * Parallel file partitioning
>> * Network latency testing
>> * Compiles cleanly on 32-bit systems
>>
>>
>>
>> All the changes:
>>
>> v1.7        Mon Oct 3 10:42:01 2011 -0400        1 commits
>>
>>       d28e76a Removed data files in unit tests.
>>
>> v1.7.0        Mon Oct 3 10:37:25 2011 -0400        6 commits
>>
>>       2943c44 Updated the manual page.
>>       1f712cd Fixed PATH problems in unit tests.
>>       ddc513c Simplified the release procedure.
>>       41f9603 Fixed some PATH issues in system tests.
>>       b388cfc Migrated the version in the Makefile.
>>       28b2a69 Added granularity summary for option -run-profiler.
>>
>> v1.7-rc2        Wed Sep 28 22:40:40 2011 -0400        7 commits
>>
>>       f49a434 Added the compilation option CONFIG_CLOCK_GETTIME for the 
>> profiler.
>>       95f2488 Remove expired reads from the list of unmated reads to reduce 
>> the computation granularity.
>>       9bf6e69 Reduced the granularity in call_RAY_SLAVE_MODE_EXTENSION() by 
>> cleaning expiration positions.
>>       d705e55 Reduced the computation granularity for the code that computes 
>> reverse-complement extensions.
>>       f76e565 Added timer warnings with -run-profiler.
>>       0ee6fcc Added comments in the communication layer of Ray.
>>       57099e3 Disabled persistent communication in the round-robin reception.
>>
>> v1.7.0-rc1        Mon Sep 26 16:59:17 2011 -0400        58 commits
>>
>>       f75bfa6 Removed inline code because compilers optimize the code anyway.
>>       6b345f9 Changed mpirun to mpiexec as mpiexec is in the standard.
>>       ec32cf4 Merged the persistent communication layer with the round-robin 
>> reception.
>>       714875d Implemented a round-robin algorithm for the reception of 
>> messages.
>>       63a3131 Write raw data for network tests if -test-network-only is 
>> provided.
>>       dc196b2 Added option -write-network-test-raw-data.
>>       fd3c033 Added a time period during which other messages are more 
>> important than urgent messages.
>>       7c2d6f6 Fixed a bug in the code that loads the checkpoint GenomeGraph.
>>       58733de Enabled the communication optimizer for the network test too.
>>       a08888b The option -show-communication-events now shows all messages 
>> with overlays too.
>>       c97ea8e Added a communication optimizer with urgent messages.
>>       b63035e Added overlays for option -show-communication-events.
>>       9779ced Added option -show-communication-events.
>>       7619e84 Added option -show-read-placement.
>>       d6b1f5d Added more details in the output of -run-profiler.
>>       6aa7b59 Removed dependency for clock_gettime.
>>       078148a Added option -debug-scaffolder.
>>       9181dd8 Added assertions and fixed a bug in GridTable.
>>       0df6381 Fixed some divisions by 0 in the scaffolder.
>>       0554af1 Regression bug on phix system test fixed.
>>       76b0b0d Fixed a bug in JoinerWorker in which two overlapping paths 
>> would not be joined together.
>>       bc7b314 Added some debugging information for -debug-fusions.
>>       783d522 Fixed a communication problem in MessageProcessor.
>>       4e2b599 The number of enabled MPI ranks can be changed during the 
>> network test by changing a variable in the source code.
>>       95d825d Merge branch 'master' of [email protected]:sebhtml/ray
>>       bdf49a2 Fixed an integer overflow in the computation of standard 
>> deviations.
>>       53eb012 Fixed scripts to accommodate new prefix directory option.
>>       2828d85 Added more debugging information in the scaffolding test.
>>       2afa3cf Updated for ScaffoldLinks.txt format to v2.0.
>>       5cd9848 Added some documentation for Infiniband.
>>       1dab4e3 Restored the default number of words in the network test to 
>> 500.
>>       21e312c Modified some scaffolding code to obtain the correct side of a 
>> contig when it allows both.
>>       a908567 Implemented a new greedy scaffolding algorithm as discussed 
>> with François Laviolette.
>>       81be6d9 Added the standard deviation in ScaffoldLinks.txt
>>       32646f2 Added non-persistent MPI communication just to compare.
>>       06c7695 Added information in ScaffoldLinks.txt
>>       f0fcdd3 Changed the default message size for network testing.
>>       eb9f8c7 Limiting scaffolding links to vertices that have one parent, 
>> one child and a coverage value near the peak.
>>       b5a98d2 RayVersion and RayCommand are now written (a bug was 
>> introduced).
>>       8dc6766 Fixed the code that counts the number of extended seeds.
>>       862b5fc Added option -write-contig-paths to write contig paths with 
>> coverage values. This is enabled by default.
>>       1db2865 The checkpoint ContigPaths is now fully operational (read and 
>> write).
>>       edbfcd7 The checkpoint ContigPaths is now written on demand.
>>       9150bc1 Removed the minimum number of raw scaffolding links.
>>       77e4a01 Modified the scaffolder routines to check the vertex coverage 
>> values in paths.
>>       18edfd0 Fixed the content of a displayed text in the fusion task 
>> creator.
>>       eaadc7c Modified the Sun Grid Engine job template to erase the 
>> directory before running the whole thing.
>>       672b5f3 Changed --oneline to --pretty=oneline for compatibility with 
>> older versions of git.
>>       fdad4e9 Cleaned some code in the task creator routines for edge 
>> purging.
>>       b9fe00b Cleaned some code in the task creator routines for edge 
>> purging.
>>       e6a4fd8 Fixed a bug in the virtual processor wherein it was not 
>> force-flushing messages when needed.
>>       32c1384 Corrected the number of flowed vertices in the seed extension.
>>       6538859 Improved heuristics for selection.
>>       221e7b4 Implemented the reverse strand case in JoinerWorker.
>>       2754a54 Added 3 unit tests for NovaEngine and improved the heuristics.
>>       f704b0f Corrected positions in JoinerWorker when on the other strand.
>>       ef06930 The default is now ASSERT=y in the Makefile.
>>       e77388e All output files are written in a directory provided with 
>> option -o.
>>
>> v1.7.0-beta1        Tue Sep 6 12:02:43 2011 -0400        50 commits
>>
>>       db6b8d3 reset() must be called in the constructor.
>>       d946b29 Fixed compilation warning.
>>       8200a28 Merge branch 'master' of github.com:sebhtml/ray
>>       78af599 Adding new files in Documentation/.
>>       c6fb5d0 Added INSTALL.txt.
>>       151028b Migrated some code only utilised by the scaffolder.
>>       345d267 Added \author tag to all classes.
>>       952e1a2 Updated Documentation files.
>>       068ce0e Added VirtualProcessor initialization.
>>       8a2be4a Removed MyForest and its iterator minion.
>>       4eebe61 Introducing the VirtualProcessor class.
>>       4897465 Fixed compilation warnings for 32-bit systems.
>>       91799e1 Fixed an argument name.
>>       1039128 Added documentation for the network latency.
>>       f11cb16 Added documentation for the virtual processor.
>>       8152dbb Added documentation for the virtual communicator.
>>       7e04a0a Fixed compilation warnings.
>>       e5f0ac7 Joiner software stack now joins otherwise un-joined paths in 
>> the distributed graph.
>>       e53c0cf Now printing hit information in JoinerWorker.
>>       3d49b71 Added selected hit in standard output for JoinerWorker.
>>       1ef78fb Updated a threshold in FusionWorker.
>>       9595b0d Added debugging information in JoinerWorker.
>>       441d133 Added Joiner code.
>>       b3b8dde Disabled the reverse-complement copies of extensions.
>>       e42c2d1 Workers push virtual messages, not real messages.
>>       80d8fc8 Fixed a state-machine bug in TaskCreator/FusionTaskCreator.
>>       ae4af42 Fixed a machine-state bug in FusionWorker.
>>       15fc7d1 Added an AUTHORS file.
>>       6a5954b Changed the default algorithm in VirtualProcessor -- now using 
>> a minimum work unit.
>>       3cbc5fd Added some debugging information for FusionTaskCreator.
>>       45280ff Removed OperatingSystem dependency in unit tests.
>>       f778c5b Implemented a new better and simpler merger module -- 
>> FusionTaskCreator/FusionWorker.
>>       57753e7 Fixed some unit tests by moving scaffolder methods.
>>       84656ae Using the VirtualProcessor for edge purge.
>>       c801302 Added some debugging information for fusions.
>>       53085bb Restored worker codes.
>>       d05a902 Added interface Worker for worker classes.
>>       3547950 Added method hasWorkToDo to VirtualProcessor.
>>       75f29f1 Added debugging messages in FusionData.
>>       b826e68 Added TaskCreator and Merger classes.
>>       173a2f3 Removed hard-coded parameter -debug-fusions.
>>       778e0d2 Changed the maximum number of cycles to 16 in merging code.
>>       10d3346 Merge branch 'master' of [email protected]:sebhtml/ray
>>       1048b06 Added scaffolder cases in Documentation/
>>       45ecca1 The ChangeLog file will not be maintained anymore, use 
>> ./scripts/dump-ChangeLog.sh
>>       b01fe11 Added option -version to Ray.
>>       58e1a6d Modified the behavior of Ray when fusions are generated.
>>       26ee554 Updated the path to Ray in system tests.
>>       c920e1e Fixed a compilation warning.
>>       7dea079 Added a function to create directories.
>>
>> v1.6.2-rc2        Wed Aug 24 20:50:07 2011 -0400        71 commits
>>
>>       70c8e92 Changed where the Ray binary is written.
>>       b14a5b7 Removed the manual target from the Makefile.
>>       a5ff5c4 Added a Documentation directory.
>>       6b6fd7c Removed logo from source.
>>       fa809fa Testing symbolic links.
>>       6a290ce Added additional debugging information.
>>       b6aa616 Restored original state.
>>       7f8708f Added an explicit flush.
>>       1a11ee5 Added checkpoint Sequences.
>>       ffef055 Updated the output of -help option.
>>       89ed32b Added option -debug-fusions.
>>       5e09388 Updated the MANUAL.
>>       797baef Added -read-write-checkpoints in the changes.
>>       d15a0ca Added gmane link in the README
>>       98dbbd4 Removed unused scripts.
>>       0034730 Skip a seed if within it during flow 1 and a vertex is already 
>> processed.
>>       37aa1ff Limiting seeds to probably unique vertices.
>>       03cb71f Don't write a checkpoint if it was just read.
>>       dcf3eb6 Added a file describing checkpoints.
>>       ee971f5 Read checkpoint before writing it.
>>       aec738a Changed 1 hash function because it was a copy.
>>       6c4fac9 Fixed a hanging problem.
>>       0094fba Added checkpoint Extensions.
>>       5dd5470 Added checkpoint Partition.
>>       ddf2125 Fixed a bug when no sequence files are provided.
>>       5a7b56e Improved checkpointing message.
>>       a6596ea Added option -test-network-only to only test the network 
>> and return.
>>       a72d0ec Improved checkpointing messages.
>>       6d29d43 Fixed a messaging bug occurring very rarely.
>>       e2a2bb6 Added a MANUAL file.
>>       218546f Checkpoint files are now written in a binary format.
>>       f63590a Checkpoints are now operational.
>>       adf7222 -read-checkpoints works with checkpoints 
>> <CoverageDistribution>, <GenomeGraph> and <Seeds>.
>>       890ba11 Option -write-checkpoints writes all checkpoints.
>>       4822527 Added options -read-checkpoints and -write-checkpoints, this 
>> is still in development.
>>       23f7218 Preparing code for a change.
>>       60d1ebe Reduced the number of messages with tag 
>> RAY_MPI_TAG_REQUEST_VERTEX_COVERAGE in SeedExtender.cpp
>>       c510d96 Added tag counts for option -run-profiler.
>>       63beff4 Fixed a display problem.
>>       1e3490f Modified the order of the steps performed when merging 
>> identical paths.
>>       5fc8c13 Changed the prototype of 
>> VirtualCommunicator::getMessageResponseElements.
>>       237339a Only send a RAY_MPI_TAG_ASK_IS_ASSEMBLED message if starting 
>> on a seed on flow 1.
>>       6fa20a9 Don't fetch read markers when not needed, use less memory to 
>> know if a vertex was assembled.
>>       7c2a084 Added skipping events.
>>       d4e5a25 Fixed N50 when there is only 1 scaffold.
>>       c17c1fd RayCommands file is written correctly now.
>>       3ece690 Ray merger will merge more things now.
>>       b6e342c Fixed a segmentation fault that occurs in rare cases.
>>       6208303 Fixed a bug in the scaffolder, now more vertices should be 
>> investigated.
>>       eb13301 Changed the precision of things that go together.
>>       0f78f4e Fixed which arguments are picked up by opcodes -p and -i.
>>       dd5c796 Input opcodes are now shuffled before being utilised.
>>       44d2bc8 Extension of seeds is done endlessly until growth stops.
>>       ac01c27 Modified NovaEngine to pass a new unit test (as well as the 
>> old unit tests).
>>       f23b52d Changed the default number of persistent requests.
>>       fe05cee Added list of working C++ compilers.
>>       6a1854a The paired read simulator is now a separate project, see 
>> https://github.com/sebhtml/paired-read-simulator
>>       6ccac7a Flag invalid choices before doing the actual selection.
>>       82f6107 Fixed a compilation warning.
>>       5a99139 Fixed an integer overflow.
>>       c63ba4a Fixed a compilation warning with gcc.
>>       0d1aab0 Fixed a compilation warning with Intel compiler.
>>       f84d308 ExtensionElement objects now contains reads in 2-bit format.
>>       ec0f6d1 Fixed a memory problem in the computation of optimal read 
>> markers.
>>       5a7c941 Implemented a Bloom filter to reduce the memory usage. This is 
>> ridiculously good and the false positive rate has no effect whatsoever on 
>> Ray thanks to the KmerAcademy.
>>       27f24d5 Added a few things in the README.
>>       a1233c0 Fixed a recently introduced regression in heuristics (should 
>> not choose an invalid choice).
>>       ed67f25 Updates in the README.
>>       83f0e0a Fixed the maximum length of input reads to 65535.
>>       dbba521 Added some documentation files.
>>       40d1d8a Fixed a bug.
>>
>> v1.6.2-rc1        Wed Aug 3 13:39:46 2011 -0400        87 commits
>>
>>       c9aa372 fixing a segfault when no contigs are found
>>       e6c66af Added N50, median, average and largest contig and scaffold 
>> lengths in PREFIX.OutputNumbers.txt
>>       1aac7a6 Removed the coverage threshold from the algorithm that finds 
>> seeds in the distributed graph. Suggested by David Eccles(gringer) from 
>> Max-Planck-Gesellschaft, München.
>>       a6244b0 Modified the selection engine so that the new unit tests also 
>> pass.
>>       0811707 Removed email of contributor.
>>       91f3251 Added David Eccles in the README
>>       798f262 Added debugging option -show-distance-summary.
>>       34d1af2 New development option: -show-distance-summary.
>>       454d5ee Added a test for open addressing.
>>       bffce62 Merging of similar paths has been modified.
>>       3b83d6c Added read placement freezing.
>>       dac7de5 Modified the extension algorithm to avoid collapsing of 
>> repeated k-mers that are near each other in the genome.
>>       c667133 Added a unit test and fixed NovaEngine to handle it.
>>       7a3443c Improved the manual.
>>       d842360 Added a section on how to launch Ray in the manual.
>>       9c449be Fixed a compilation warning.
>>       182fe2c Added some unit tests for the Ray NovaEngine (for mate-pair 
>> reads). Seems to work quite well so far.
>>       a5e631b New option: -write-seeds which is useful for debugging the 
>> code.
>>       1902636 Only use NovaEngine when paired information is available.
>>       91359b0 Added options -use-NovaEngine and -show-NovaEngine for 
>> debugging purposes.
>>       8bc58aa Fixed compilation errors with HAVE_LIBZ=y and HAVE_LIBBZ2=y
>>       af7557e New output files: PREFIX.SequencePartition.txt and 
>> PREFIX.NumberOfSequences. Improved the content shown with -help.
>>       9598fda Now using the NovaEngine.
>>       ddeca62 Fixed a bug in the NovaEngine.
>>       13e5e76 Removed by default the reporting of libraries in stdout.
>>       5e5edae Option use:NovaEngine enables experimental NovaEngine.
>>       73c61bd Fixed a bug in the recently introduced peer-to-peer parallel 
>> Partitioner.
>>       0bbd8d3 Removed options in 2 system tests.
>>       1c53c6e Added comments at random places.
>>       3696be0 Added 2 scripts for code editing.
>>       1c135af Disabling by default the experimental NovaEngine. Results so 
>> far are promising!
>>       e1efa6e Fixed a compilation warning.
>>       3fe9fab Removed roughly half of the messages with the MPI tag 
>> RAY_MPI_TAG_KMER_ACADEMY_DATA.
>>       4b3639f Added a unit test and modified CoverageDistribution.cpp to 
>> handle low-coverage datasets.
>>       b76e9dd * Added information on unknown nucleotides in the instruction 
>> manual.
>>       9847b9e Added a TODO item.
>>       20ad18f Added an abstraction layer for the operating system.
>>       9706603 Improved the README for system tests.
>>       4372845 Only show nova choices if -show-extension-choice is provided.
>>       67a87ed Improved the document about patching.
>>       231dbc9 Added a file describing how to submit a patch.
>>       88b13a0 Unit tests are now files named test_<test_name>.sh
>>       43200b5 Fixed a typo.
>>       efda9d9 Moved Kmer routines in the class Kmer.
>>       207a193 Added an entry in the changelog.
>>       19b3905 Added a symbolic link.
>>       f7addb4 Added 35 unit tests.
>>       e704aa1 Simplified the algorithm that finds peak and added 35 unit 
>> tests to test it (for various datasets).
>>       1a8b0e0 Improved the NovaEngine according to unit tests.
>>       412052b Improved the unit tests for the NovaEngine.
>>       862e6dd Removed some messages.
>>       1bd4d5a Removed assertion.
>>       4471c37 Improved the NovaEngine, but not using it yet. It needs more 
>> testing.
>>       668ad2f Added a unit test for the NovaEngine.
>>       9c8b02d New experimental heuristics: The Ray NovaEngine.
>>       1ef868d Created a heuristics module and moved related bits in it.
>>       cd8c1af Improved the peak finder when there is no deviation.
>>       15ea481 Fixed a compilation error.
>>       a2470f2 Moved the configuration of the virtual communicator in 
>> Machine.cpp
>>       810187d Corrected a code comment.
>>       982ff04 Updated the coding style.
>>       f7d9087 Added a coding style file.
>>       3f15314 Don't compute or update peaks for libraries with 
>> manually-provided information.
>>       d386173 When the extension is finished, show library peak usage.
>>       952dfb8 Don't print the tree.
>>       5a49add Changed to 64 slots.
>>       8c15d79 Corrected the peak finder.
>>       fdc1bde Merge branch 'master' of [email protected]:sebhtml/ray
>>       2b5f716 Changed the behavior for repeats.
>>       0d6565e Fixed a system test.
>>       857c207 Fixed some compilation warning with gcc 4.1.2.
>>       4bb20ba Added an entry in the change log for 1.6.2.
>>       5fab076 Now working on v1.6.2
>>       8354455 Fixed a compilation error due to the algorithm library.
>>       8ffee2b Merge branch 'master' of github.com:sebhtml/ray
>>       29ab020 Fixed a bug in the incremental resizing algorithm of 
>> MyHashTable.
>>       bacc0c7 Fixed comments and assertions for new correct code.
>>       d48fa33 Fixed an implementation bug for double hashing.
>>       c9e8a5e Added a TODO item.
>>       8549758 File partitioning is now performed in parallel.
>>       9825ecf Implemented parallel file partitioning.
>>       4a4b739 Don't set NSLOTS if already defined.
>>       d652a1c Now Ray uses the maximum peak of a paired library to compute 
>> the expiry position of a read.
>>       065df3a Choosing the good peak for a paired library if a mate is 
>> already available.
>>       bd324ea Now selecting the correct peak to choose the next vertex.
>>       7a72aac Ray can find more than one peak in any paired library.
>>       7d92f06 Ported the prototype for finding peaks from Python to C++.
>>
>>
>>                                                     Sébastien
>>
>> ------------------------------------------------------------------------------
>> All the data continuously generated in your IT infrastructure contains a
>> definitive record of customers, application performance, security
>> threats, fraudulent activity and more. Splunk takes this data and makes
>> sense of it. Business sense. IT sense. Common sense.
>> http://p.sf.net/sfu/splunk-d2dcopy1
>> _______________________________________________
>> Denovoassembler-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/denovoassembler-users
>
>
