Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-08-10 Thread Øyvind A . Holm
On 30 June 2014 14:56, Jakub Narębski jna...@gmail.com wrote: Linus Torvalds wrote: .. even there, there's another issue. With enough memory, the diff itself should be fairly reasonable to do, but we do not have any sane *format* for diffing those kinds of things. The regular textual

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-30 Thread Jakub Narębski
Linus Torvalds wrote: On Fri, Jun 27, 2014 at 10:48 AM, Junio C Hamano gits...@pobox.com wrote: Even though the original question mentioned delta discovery, I think what was being asked is not delta in the Git sense (which your answer is about) but is can we diff two long sequences of text

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-28 Thread Jarrad Hope
Thank you all for replying. It's just as Jason suggests: GenBank, FASTA and EMBL are more or less the de facto standards. I suspect FASTA will be phased out because (to my knowledge) it does not support gene annotation; nevertheless, they are all text-based. These formats usually insert linebreaks

Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Jarrad Hope
Hello, As a software developer I've used git for years and have found it the perfect solution for source control. Lately I have found myself using git in a unique use-case - modifying DNA/RNA sequences and storing them in git, which are essentially software/source code for cells/life. For

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Shawn Pearce
On Fri, Jun 27, 2014 at 1:45 AM, Jarrad Hope m...@jarradhope.com wrote: As a software developer I've used git for years and have found it the perfect solution for source control. Lately I have found myself using git in a unique use-case - modifying DNA/RNA sequences and storing them in git,

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Junio C Hamano
Shawn Pearce spea...@spearce.org writes: Git does source code well. I don't know enough to judge if DNA/RNA sequence storage is similar enough to source code to benefit from things like `git log -p` showing deltas over time, or if some other algorithm would be more effective. From my

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Linus Torvalds
On Fri, Jun 27, 2014 at 10:48 AM, Junio C Hamano gits...@pobox.com wrote: Even though the original question mentioned delta discovery, I think what was being asked is not delta in the Git sense (which your answer is about) but is can we diff two long sequences of text (that happens to consist

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Linus Torvalds
On Fri, Jun 27, 2014 at 12:38 PM, Linus Torvalds torva...@linux-foundation.org wrote: I think it might be possible to just specify a special diff algorithm (git already supports that, obviously), and just introduce a new "use binary diffs with a textual representation" model. Another model

RE: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Jason Pyeron
-----Original Message----- From: Linus Torvalds Sent: Friday, June 27, 2014 15:39 On Fri, Jun 27, 2014 at 10:48 AM, Junio C Hamano gits...@pobox.com wrote: Even though the original question mentioned delta discovery, I think what was being asked is not delta in the Git sense (which

Re: Tackling Git Limitations with Singular Large Line-seperated Plaintext files

2014-06-27 Thread Linus Torvalds
On Fri, Jun 27, 2014 at 12:55 PM, Jason Pyeron jpye...@pdinc.us wrote: The issue will be, if we talk about changes other than same length substitutions (e.g. Down's Syndrome where it has an insertion of code) would require one code per line for the diffs to work nicely. Not my area of