Here's an example to show how the file format works.
$ cat version1
pseudocode bla {
print("This is version 1");
String x = "hello";
print(x);
}
$ cat version2
pseudocode bla {
print("This is version 2");
print("Starting now");

String x = "hello";
print(x);
}
$ linecomp c test.lc version1 version2
Adding file version1
Adding file version2
Compressing a total of 192 bytes...
Compression done [12 ms]
Archive /home/stefan/linecomp-demo/test.lc stats:
1K of text compressed into 1K (2 files)
$ gunzip <test.lc
LINECOMP 8

String x = "hello";
print("Starting now");
print("This is version 1");
print("This is version 2");
print(x);
pseudocode bla {
}
1 5
7 0
8 9
version1=6 3 10
version2=6 4 2 0 10
As you see, the file consists of these 4 parts:
1. Magic header and number of literal lines that follow (n)
2. All the literal lines in lexical order
3. A list of pairs of indices. Each index points either into the literal table
(if i < n) or into the table of pairs itself (if i >= n, with index n being
the first pair). This way we build larger chunks of lines. In the example
n = 8, so index 8 is the pair "1 5" and index 10 is the pair "8 9".
4. A list of the files. Each entry is the file's name followed by a list of
indices as defined above, which are expanded and concatenated to yield the
file's contents.
Don't tell me that's not one of the simplest formats you have ever seen.
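To make the four parts concrete, here is a minimal Python sketch of a reader
for the decompressed text. It is not the actual linecomp code; the names
parse_archive, expand and extract are made up for illustration. It assumes
that literal 0 in the example is the empty line (the header announces 8
literals and the file lists refer to index 0), that a file's lines are joined
with \n, and that pair lines never contain "=".

```python
# Example archive text as printed by gunzip above.
# Assumption: literal 0 is the empty line.
ARCHIVE = "\n".join([
    "LINECOMP 8",
    "",
    'String x = "hello";',
    'print("Starting now");',
    'print("This is version 1");',
    'print("This is version 2");',
    "print(x);",
    "pseudocode bla {",
    "}",
    "1 5",
    "7 0",
    "8 9",
    "version1=6 3 10",
    "version2=6 4 2 0 10",
    "",
])


def parse_archive(text):
    """Split the archive text into literals, pairs and per-file index lists."""
    lines = text.split("\n")
    magic, count = lines[0].split()
    if magic != "LINECOMP":
        raise ValueError("not a LINECOMP archive")
    n = int(count)
    literals = lines[1:1 + n]          # part 2: the n literal lines
    pairs, index_lists = [], {}
    for line in lines[1 + n:]:
        if not line:
            continue                   # skip the trailing empty string
        if "=" in line:                # part 4: name=index index ...
            name, _, refs = line.rpartition("=")
            index_lists[name] = [int(r) for r in refs.split()]
        else:                          # part 3: one pair of indices
            left, right = line.split()
            pairs.append((int(left), int(right)))
    return literals, pairs, index_lists


def expand(index, literals, pairs):
    """Follow one index down to its literal lines, left to right."""
    n = len(literals)
    out, stack = [], [index]
    while stack:
        i = stack.pop()
        if i < n:
            out.append(literals[i])
        else:
            left, right = pairs[i - n]
            stack.append(right)        # pushed first, so it is popped last
            stack.append(left)
    return out


def extract(text):
    """Return a dict mapping each file name to its reconstructed contents."""
    literals, pairs, index_lists = parse_archive(text)
    files = {}
    for name, refs in index_lists.items():
        lines = []
        for ref in refs:
            lines.extend(expand(ref, literals, pairs))
        files[name] = "\n".join(lines)
    return files


if __name__ == "__main__":
    for name, contents in extract(ARCHIVE).items():
        print("==", name)
        print(contents, end="")
```

Under these assumptions, extract(ARCHIVE)["version1"] gives back the five
lines of version1 with a final newline. The empty literal at the end of pair
"7 0" is what produces that final newline.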
---
[An excursion: Can binary data be compressed this way? You need to define how
to split it into "lines".
I believe you can actually coerce binary data into LINECOMP 0.1 as it is. It
will still look for line breaks in the data, which might be a rather
nonsensical exercise, but it should work nonetheless. Input is not interpreted
as text in any way except for the \n's. One might define a different content
splitting strategy for other file types and get usable results there too.]
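The splitting idea can be sketched in a few lines of Python. This is only an
illustration and not part of linecomp: split_on_newlines mirrors what is
described above (the bytes are cut at \n and nothing else is interpreted),
while split_fixed is a hypothetical alternative strategy that cuts blocks of a
fixed size. Both are lossless as long as the pieces are joined the same way
they were cut.

```python
def split_on_newlines(data):
    """Cut raw bytes at every 0x0A byte; the separators are dropped."""
    return data.split(b"\n")


def join_newlines(pieces):
    """Inverse of split_on_newlines: put the 0x0A bytes back."""
    return b"\n".join(pieces)


def split_fixed(data, size=256):
    """Hypothetical alternative: cut the data into blocks of `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def join_fixed(pieces):
    """Inverse of split_fixed: plain concatenation."""
    return b"".join(pieces)


if __name__ == "__main__":
    # Some repetitive binary data, including 0x0A bytes at arbitrary places.
    blob = bytes(range(256)) * 4
    strategies = (
        ("newline", split_on_newlines, join_newlines),
        ("fixed", split_fixed, join_fixed),
    )
    for name, split, join in strategies:
        pieces = split(blob)
        assert join(pieces) == blob    # the round trip is lossless
        print(name, len(pieces), "pieces,", len(set(pieces)), "distinct")
```

The fewer distinct pieces a strategy yields for a given set of files, the more
there is to share in the literal table, so the choice of splitter decides
whether the results are usable.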
------------------------------------------
Artificial General Intelligence List: AGI
Permalink:
https://agi.topicbox.com/groups/agi/Tb2cf064c700f181c-Ma6d39bd7be49108424be06e4