Re: [fossil-users] Unversioned files.

2016-09-11 Thread Adam Jensen
On 09/11/2016 05:30 PM, Stephan Beal wrote:
> And i would argue against it as falling well out of scope for an SCM ;)

I'm okay with that.
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Adam Jensen
On 09/11/2016 06:38 PM, Scott Robison wrote:
> Of course, adding differentiation and specialization increases
> complexity, so it can be a tricky balancing act. 

We, as a species, have already gone down that rabbit hole.


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Scott Robison
On Sun, Sep 11, 2016 at 3:27 PM, Adam Jensen wrote:

> On 09/11/2016 04:42 PM, Scott Robison wrote:
> > I may not be understanding you, but from my point of view, it already
> > does what you want by supporting versioned files that you simply never
> > change. For example, you could have a repo that has a structure along
> > the lines of:
> >
> > /root/static-data/ -> a place to store lots of big binary blobs that you
> > don't intend to ever modify.
> >
> > /root/dynamic-data/ -> a place to store things that you want to track
> > the history for
> >
> > Is this inadequate? If so, how or why?
>
> My impression is that with a different class of files, the operations,
> interface, and underlying implementation can be specialized for the
> qualities, characteristics, and various needs that are peculiar to that
> class. For example, an unversioned file might avoid some of the CPU,
> memory, and/or storage requirements that are involved in versioned
> files. Also, I imagine there might be some specialized commands and
> alerts associated with the two major use-cases for unversioned files
> (critical immutable data, and temporary intermediate data).
>

You are generally correct that a specialized solution can often improve on
the generalized solution by some metric. There are ways to do things with
huge files (such as diff them) that wouldn't strictly require as much
memory as the general case does, but the utility is marginal at best for
fossil's design case. Very few people would have a need to compute deltas
for a terabyte sized file, so it is not implemented. Not that you were
suggesting someone should be able to do that, I was just going for the
first very ridiculous idea that came to mind.

In any case, it might be possible to make the unversioned functionality
"better" in some way, but it seems like a less than ideal use of time IMO.
Fossil already handles the critical file tracking for versioned files, just
don't overwrite them. And if at some point you discover that the critical
data really is wrong or outdated, you can commit a new version over the old
one. In essence, the unversioned critical data functionality you've
envisioned works off the assumption that you can see into the future and
guarantee you'll never change your mind about the utility of versioning for
these files. YMMV

As SB said, it's just an opinion, and not one that drives the project. I
just wanted to try to understand what you were hoping to gain from it, so
thanks for answering my questions.


>
> Ultimately, I suppose the purpose of a tool like Fossil is to assist in
> the management of complexity. Differentiation and specialization seem
> like a fundamental part of that. But yeah, my tests all currently make
> use of the versioned file class for storage.


Of course, adding differentiation and specialization increases complexity,
so it can be a tricky balancing act.
-- 
Scott Robison


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Stephan Beal
On Sep 11, 2016 21:49, "Adam Jensen" wrote:
>
> On 09/11/2016 01:54 PM, Stephan Beal wrote:
> > On Sep 11, 2016 18:18, "Adam Jensen" wrote:
> [snip]
> >> '''
> >> 5.4 Unversioned File Sync
> >>
> >> "Unversioned files" are files held in the repository where only the most
> >> recent version of the file is kept rather than the entire change
> >> history. Unversioned files are intended to be used to store ephemeral
> >> content, such as compiled binaries of the most recent release.
> >> '''
> >>
> >> The phrase "ephemeral content" is a bit disconcerting. It suggests
> >> values and attitudes towards this data which will probably be reflected
> >> in the requirements, specification, and implementation of the software.
> >>
> >> In the use-case I have in mind, this data would be "immutable content"
> >> and should be considered precious.
> >
> > That's not, as i understand it, the intention of unversioned files.
> > Anything "important" needs to be checked in (versioned). Unversioned
> > files are primarily intended for hosting pre-built binaries and such.
>
> We might not be intersecting on this point; a perspective difference, I
> suspect. What I am suggesting is that the current intention of
> unversioned files might need to be slightly tweaked to encourage an
> additional use-case where the repository supports the organization and
> management of critical data (immutable; a snap-shot of reality; large
> binary files) in addition to highly revised text files such as data
> analysis scripts, annotations, and documentation.

And i would argue against it as falling well out of scope for an SCM ;).
(Not that my opinion on the topic matters. ;) Unversioned files were, as i
understand it (possibly incorrectly), added primarily as a convenience for
a nearly-universal need/use case: hosting pre-built copies of sources held
in the SCM. Stretching that to cover a wide range of cases (some arguably
better-suited to scalable cloud infrastructure) sounds unnecessary to me.
But that's just me, and i historically tend to hold minority opinions, so
take what i say with a grain of salt.

- stephan
(Sent from a mobile device, possibly from bed. Please excuse brevity,
typos, and top-posting.)


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Adam Jensen
On 09/11/2016 04:42 PM, Scott Robison wrote:
> I may not be understanding you, but from my point of view, it already
> does what you want by supporting versioned files that you simply never
> change. For example, you could have a repo that has a structure along
> the lines of:
> 
> /root/static-data/ -> a place to store lots of big binary blobs that you
> don't intend to ever modify.
> 
> /root/dynamic-data/ -> a place to store things that you want to track
> the history for
> 
> Is this inadequate? If so, how or why?

My impression is that with a different class of files, the operations,
interface, and underlying implementation can be specialized for the
qualities, characteristics, and various needs that are peculiar to that
class. For example, an unversioned file might avoid some of the CPU,
memory, and/or storage requirements that are involved in versioned
files. Also, I imagine there might be some specialized commands and
alerts associated with the two major use-cases for unversioned files
(critical immutable data, and temporary intermediate data).

Ultimately, I suppose the purpose of a tool like Fossil is to assist in
the management of complexity. Differentiation and specialization seem
like a fundamental part of that. But yeah, my tests all currently make
use of the versioned file class for storage.


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Scott Robison
On Sun, Sep 11, 2016 at 1:49 PM, Adam Jensen wrote:

> On 09/11/2016 01:54 PM, Stephan Beal wrote:
> > On Sep 11, 2016 18:18, "Adam Jensen" wrote:
> [snip]
> >> '''
> >> 5.4 Unversioned File Sync
> >>
> >> "Unversioned files" are files held in the repository where only the most
> >> recent version of the file is kept rather than the entire change
> >> history. Unversioned files are intended to be used to store ephemeral
> >> content, such as compiled binaries of the most recent release.
> >> '''
> >>
> >> The phrase "ephemeral content" is a bit disconcerting. It suggests
> >> values and attitudes towards this data which will probably be reflected
> >> in the requirements, specification, and implementation of the software.
> >>
> >> In the use-case I have in mind, this data would be "immutable content"
> >> and should be considered precious.
> >
> > That's not, as i understand it, the intention of unversioned files.
> > Anything "important" needs to be checked in (versioned). Unversioned
> > files are primarily intended for hosting pre-built binaries and such.
>
> We might not be intersecting on this point; a perspective difference, I
> suspect. What I am suggesting is that the current intention of
> unversioned files might need to be slightly tweaked to encourage an
> additional use-case where the repository supports the organization and
> management of critical data (immutable; a snap-shot of reality; large
> binary files) in addition to highly revised text files such as data
> analysis scripts, annotations, and documentation. Use of the unversioned
> files facility for storage and management of fleeting, intermediate
> files is also valuable. Quoting Chancellor Palpatine, I suggest we
> "embrace...a larger view" of unversioned files.
>
> On the other hand, it is still unclear [to me] if Fossil can be used
> this way. Once more complete unversioned files functionality is
> available, I can answer some of my questions through tests of the
> system. The critical points of interest are:
>
> 1. Maximum single unversioned file size.
> 2. Repository performance and integrity as overall size increases.
> 3. FuseFS performance.
> 4. Sync performance/reliability/usability.
>
> Once I have a better idea of these various characteristics and
> constraints, then it should be possible to estimate where Fossil might
> be a reasonable solution and where some other approach is needed.
>

I may not be understanding you, but from my point of view, it already does
what you want by supporting versioned files that you simply never change.
For example, you could have a repo that has a structure along the lines of:

/root/static-data/ -> a place to store lots of big binary blobs that you
don't intend to ever modify.

/root/dynamic-data/ -> a place to store things that you want to track the
history for

Is this inadequate? If so, how or why?

-- 
Scott Robison


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Adam Jensen
On 09/11/2016 01:54 PM, Stephan Beal wrote:
> On Sep 11, 2016 18:18, "Adam Jensen" wrote:
[snip]
>> '''
>> 5.4 Unversioned File Sync
>>
>> "Unversioned files" are files held in the repository where only the most
>> recent version of the file is kept rather than the entire change
>> history. Unversioned files are intended to be used to store ephemeral
>> content, such as compiled binaries of the most recent release.
>> '''
>>
>> The phrase "ephemeral content" is a bit disconcerting. It suggests
>> values and attitudes towards this data which will probably be reflected
>> in the requirements, specification, and implementation of the software.
>>
>> In the use-case I have in mind, this data would be "immutable content"
>> and should be considered precious.
> 
> That's not, as i understand it, the intention of unversioned files.
> Anything "important" needs to be checked in (versioned). Unversioned
> files are primarily intended for hosting pre-built binaries and such.

We might not be intersecting on this point; a perspective difference, I
suspect. What I am suggesting is that the current intention of
unversioned files might need to be slightly tweaked to encourage an
additional use-case where the repository supports the organization and
management of critical data (immutable; a snap-shot of reality; large
binary files) in addition to highly revised text files such as data
analysis scripts, annotations, and documentation. Use of the unversioned
files facility for storage and management of fleeting, intermediate
files is also valuable. Quoting Chancellor Palpatine, I suggest we
"embrace...a larger view" of unversioned files.

On the other hand, it is still unclear [to me] if Fossil can be used
this way. Once more complete unversioned files functionality is
available, I can answer some of my questions through tests of the
system. The critical points of interest are:

1. Maximum single unversioned file size.
2. Repository performance and integrity as overall size increases.
3. FuseFS performance.
4. Sync performance/reliability/usability.

Once I have a better idea of these various characteristics and
constraints, then it should be possible to estimate where Fossil might
be a reasonable solution and where some other approach is needed.



Re: [fossil-users] Unversioned files.

2016-09-11 Thread Stephan Beal
On Sep 11, 2016 18:18, "Adam Jensen" wrote:
>
> On 09/11/2016 05:41 AM, Stephan Beal wrote:
> > On Sep 10, 2016 19:31, "Adam Jensen" wrote:
> >> 1. What is the largest size of any single file that can be checked into
> >> a repository?
> >
> > effectively limited by system memory: fossil needs approx. 2-3x the
> > file's size (concurrently in RAM) to create/apply deltas.
>
> Does it seem reasonable to assume that unversioned files will not have
> those (2-3x) memory requirements?

That seems reasonable but i have not yet used that feature nor know the
code (whereas i worked with the diff code a couple years ago).

>
> Also, do you suppose the *initial check-in* [of typical versioned files]
> involves that kind of memory usage (2-3x)?

Not as i recall, no.

Another memory cost comes to mind: the zip command creates zip files
in-memory, and may choke on huge repos/files.

>
> >> 2. How well will the sync command handle large files?
> >
> > it syncs whole blobs at a time, which are normally (for most versions of
> > a file after the first) highly efficient/tiny deltas.
>
> If a blob is ~1GB, will the data transfer mechanism hang in there and
> get the job done? I found this page:
>
> https://www.fossil-scm.org/fossil/doc/trunk/www/sync.wiki

It should always keep going until success or an unrecoverable error.

>
> which might answer my question after I digest it all (probably requiring
> some research (I'm not a Computer Scientist or Software Engineer)).
>
> But there was something, somewhat unrelated, on that page that did stand
> out. It says:
>
> '''
> 5.4 Unversioned File Sync
>
> "Unversioned files" are files held in the repository where only the most
> recent version of the file is kept rather than the entire change
> history. Unversioned files are intended to be used to store ephemeral
> content, such as compiled binaries of the most recent release.
> '''
>
> The phrase "ephemeral content" is a bit disconcerting. It suggests
> values and attitudes towards this data which will probably be reflected
> in the requirements, specification, and implementation of the software.
>
> In the use-case I have in mind, this data would be "immutable content"
> and should be considered precious.

That's not, as i understand it, the intention of unversioned files.
Anything "important" needs to be checked in (versioned). Unversioned files
are primarily intended for hosting pre-built binaries and such.

- stephan
(Sent from a mobile device, possibly from bed. Please excuse brevity,
typos, and top-posting.)


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Adam Jensen
On 09/11/2016 05:41 AM, Stephan Beal wrote:
> On Sep 10, 2016 19:31, "Adam Jensen" wrote:
>> 1. What is the largest size of any single file that can be checked into
>> a repository?
>
> effectively limited by system memory: fossil needs approx. 2-3x the
> file's size (concurrently in RAM) to create/apply deltas.

Does it seem reasonable to assume that unversioned files will not have
those (2-3x) memory requirements?

Also, do you suppose the *initial check-in* [of typical versioned files]
involves that kind of memory usage (2-3x)?

>> 2. How well will the sync command handle large files?
> 
> it syncs whole blobs at a time, which are normally (for most versions of
> a file after the first) highly efficient/tiny deltas.

If a blob is ~1GB, will the data transfer mechanism hang in there and
get the job done? I found this page:

https://www.fossil-scm.org/fossil/doc/trunk/www/sync.wiki

which might answer my question after I digest it all (probably requiring
some research (I'm not a Computer Scientist or Software Engineer)).

But there was something, somewhat unrelated, on that page that did stand
out. It says:

'''
5.4 Unversioned File Sync

"Unversioned files" are files held in the repository where only the most
recent version of the file is kept rather than the entire change
history. Unversioned files are intended to be used to store ephemeral
content, such as compiled binaries of the most recent release.
'''

The phrase "ephemeral content" is a bit disconcerting. It suggests
values and attitudes towards this data which will probably be reflected
in the requirements, specification, and implementation of the software.

In the use-case I have in mind, this data would be "immutable content"
and should be considered precious. The goals would be to avoid
accidental loss and/or corruption. It isn't a low-value, fleeting
scratch-pad that would be thrown away on a regular basis.
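
As a rough illustration of what "avoid accidental loss and/or corruption"
could mean in practice, here is a minimal sketch that records a SHA-256
checksum for every file under a directory and later reports any file that
was added, removed, or altered. It is independent of Fossil and says nothing
about how Fossil stores or checks data; the function names (sha256_of,
build_manifest, changed_files) are hypothetical and chosen only for this
example. Files are hashed in chunks, so a large file does not need to fit in
memory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Return the sorted names of files added, removed, or modified since
    the manifest was built."""
    current = build_manifest(root)
    names = manifest.keys() | current.keys()
    return sorted(n for n in names if manifest.get(n) != current.get(n))
```

A manifest built once from the "immutable" data could be stored somewhere
safe and compared against the directory at any later time; an empty result
from changed_files means nothing has drifted.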

Perspective makes a difference in what gets built into a system, and how
it gets built...


Re: [fossil-users] Unversioned files.

2016-09-11 Thread Stephan Beal
On Sep 10, 2016 19:31, "Adam Jensen" wrote:
> 1. What is the largest size of any single file that can be checked into
> a repository?

effectively limited by system memory: fossil needs approx. 2-3x the file's
size (concurrently in RAM) to create/apply deltas.
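
To make the 2-3x rule of thumb concrete, here is a small back-of-the-envelope
sketch. It only multiplies the file size by the approximate factors quoted
above; it does not measure what fossil actually allocates, and the function
names and default factors are assumptions made up for this example.

```python
def estimated_delta_ram(file_size_bytes, low_factor=2.0, high_factor=3.0):
    """Return a (low, high) estimate, in bytes, of the RAM needed to
    create or apply a delta, using the approximate 2-3x rule of thumb."""
    if file_size_bytes < 0:
        raise ValueError("file size must be non-negative")
    return (int(file_size_bytes * low_factor),
            int(file_size_bytes * high_factor))


def fits_in_ram(file_size_bytes, available_ram_bytes):
    """Conservative check: compare the high estimate against available RAM."""
    _, high = estimated_delta_ram(file_size_bytes)
    return high <= available_ram_bytes


if __name__ == "__main__":
    gib = 1024 ** 3
    low, high = estimated_delta_ram(1 * gib)
    print(f"1 GiB file: roughly {low / gib:.0f}-{high / gib:.0f} GiB of RAM")
    print("fits in 4 GiB:", fits_in_ram(1 * gib, 4 * gib))
```

By this estimate a 1 GiB file would want roughly 2-3 GiB of RAM, while a
2 GiB file would already exceed a 4 GiB machine on the high end.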

> 2. How well will the sync command handle large files?

it syncs whole blobs at a time, which are normally (for most versions of a
file after the first) highly efficient/tiny deltas.

- stephan beal
Written on an embedded device. Please pardon brevity and auto-correction.