Re: Culling org files (notion of Types, many small vs few big files)

2021-03-05 Thread Samuel Wales
i should maybe point out that my focus in op is merely, literally
something like "possibilities for code for helping the user archive or
delete stuff in existing bloated org files".

but i am also ok with tangents like other ideas for speed and less
clutter and organizational tricks and so on.  this mailing list in the
old days used to talk about such things.

like, what's a project, what does next mean, "is gtd for me", "at some
point you have to actually /do/ a task", and will the org system get
bloated and slow for users in the future.


On 3/5/21, Samuel Wales  wrote:
> the closest thing i have to types is merely this
>
> - some files go in org-agenda-files, by basename pattern
> - some to text search extra files, pattern, don't need the tses
> - some to neither, no pattern [blog.org]
> - some get put manually in refile targets
>
> which has no real connection to either ontology [like hvac], or
> types/purposes/statuses [like your project type] as you have.
>
> it is merely what things i need in ts agenda vs. search agenda view.
>
> to me the outline is a forest with the tree problem [i.e. the fact
> that we want graphs not trees] kludged using id links and searching.
> files are major categories.  too many and i get confused what is
> where.  tree structure is by ontology, not types.
>
> i think my agenda views therefore wouldn't be any less cluttered or
> confusing if i had more files or shallower trees.  [assuming i set
> category property which i do.]  so that is why i was confused by your
> comment.
>
> i find that the more complex a system i develop, the more i regret it
> later because i can't just reverse it, i can't do maintenance at
> anything close to a sufficient rate, and i get confused.
>
> the ts comment i made is merely that i do like "* LOG [2021-03-05
> Fri 13:44] hi" usefully.  not sure if relevant.
>
>
> On 3/5/21, TRS-80  wrote:
>> On 2021-03-04 16:11, Samuel Wales wrote:
>>>> tim> naming convention ... to determine what is included
>>>
>>> this is also what i do.  org-agenda-files is just set at startup
>>> according to basename pattern.
>>
>> I find it very interesting that all three of us seem to have
>> independently arrived at some of the same conclusions.
>>
>> Originally I did not want to go off on this tangent, but now I wonder
>> how close what I am doing is to what you guys are doing.
>>
>> In my case, I came up with some notion of "Types."  Each Org file MAY
>> use one of a list of defined Types at the beginning of the file name
>> (which is also the first top level headline in the file, starting at
>> position 0), followed by a delimiter (currently ": ").[0]
>>
>> I keep experimenting with my list of Types (I probably have too many),
>> but there are a few that definitely seem useful so far.  For example:
>>
>> - Project: Pretty self-explanatory.
>>
>> - Area: A concept lifted straight from PARA Method ("a sphere of
>>   activity with a standard to be maintained over time").[1]
>>
>>   - Equipment: A special type of Area: that pertains to a single major
>>     piece of equipment (like a vehicle) or some group of related
>>     equipment (e.g. "small shop equipment" or "home appliances",
>>     etc.).
>>
>> - HowTo: Literally "how to do x" which is great for remembering those
>>   obscure command line invocations (or whatever) that you only use 2x
>>   per year.  Combined with headline level completing-read search (see
>>   below) this becomes very powerful/handy.
>>
>> So then, by default, any Org files starting with either Area:,
>> Equipment:, or Project: are the only ones that are considered "active"
>> for purposes of agenda and scanned for TODOs (implemented as a simple
>> `directory-files' function and a regexp).  I use my system as a
>> combination of TODO and PIM[2], so this makes a nice logical split
>> where all those PIM "random notes" do not impact the agenda
>> performance whatsoever.
>>
>> I have some other custom agenda functions as well, for things like
>> periodic reviews (in the GTD sense) and others.  Org's Agenda really
>> is essentially just like a database query engine when you get right
>> down to it (except storing in plain text of course).
>>
>>>> trs> [smaller files] My agenda is not cluttered.
>>>
>>> it is not clear to me why more smaller files and shallower trees in
>>> the outline would improve the agenda.  sounds good though.
>>
>> I somewhat addressed this above with Types (which improve
>> performance), but as to your specific point (clutter)...
>>
>> OK, so maybe not /directly/.  But rather the whole system has
>> improved my engagement, by way of no longer feeling lost/overwhelmed
>> as I did with very deep trees in only a few files.  I think it is just
>> easier to reason about some small subset of the whole at one time, as
>> represented in a single file.  In theory, I guess you could accomplish
>> the same by narrowing subtrees or other methods, but for whatever
>> reason separate files seem to appeal more to me than 

Re: Culling org files (notion of Types, many small vs few big files)

2021-03-05 Thread Samuel Wales
the closest thing i have to types is merely this

- some files go in org-agenda-files, by basename pattern
- some to text search extra files, pattern, don't need the tses
- some to neither, no pattern [blog.org]
- some get put manually in refile targets

which has no real connection to either ontology [like hvac], or
types/purposes/statuses [like your project type] as you have.

it is merely what things i need in ts agenda vs. search agenda view.
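
A minimal sketch of that split, in Python for illustration only.  In
Emacs the first two lists would feed `org-agenda-files' and
`org-agenda-text-search-extra-files'.  The basename patterns and the
function name are hypothetical; the thread does not say which patterns
are actually used.

```python
import re

# Hypothetical basename patterns, chosen only for this example.
AGENDA_RE = re.compile(r"^todo-.*\.org$")
SEARCH_ONLY_RE = re.compile(r"^notes-.*\.org$")


def split_org_files(basenames):
    """Sort basenames into (agenda, search_only, neither) lists."""
    agenda, search_only, neither = [], [], []
    for name in sorted(basenames):
        if AGENDA_RE.match(name):
            agenda.append(name)        # scanned for the timestamp agenda
        elif SEARCH_ONLY_RE.match(name):
            search_only.append(name)   # only reachable via text search
        else:
            neither.append(name)       # e.g. blog.org, matches no pattern
    return agenda, search_only, neither
```

Files that go into refile targets by hand are deliberately left out of
the sketch, since no pattern governs them.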

to me the outline is a forest with the tree problem [i.e. the fact
that we want graphs not trees] kludged using id links and searching.
files are major categories.  too many and i get confused what is
where.  tree structure is by ontology, not types.

i think my agenda views therefore wouldn't be any less cluttered or
confusing if i had more files or shallower trees.  [assuming i set
category property which i do.]  so that is why i was confused by your
comment.

i find that the more complex a system i develop, the more i regret it
later because i can't just reverse it, i can't do maintenance at
anything close to a sufficient rate, and i get confused.

the ts comment i made is merely that i do like "* LOG [2021-03-05
Fri 13:44] hi" usefully.  not sure if relevant.


On 3/5/21, TRS-80  wrote:
> On 2021-03-04 16:11, Samuel Wales wrote:
>>> tim> naming convention ... to determine what is included
>>
>> this is also what i do.  org-agenda-files is just set at startup
>> according to basename pattern.
>
> I find it very interesting that all three of us seem to have
> independently arrived at some of the same conclusions.
>
> Originally I did not want to go off on this tangent, but now I wonder
> how close what I am doing is to what you guys are doing.
>
> In my case, I came up with some notion of "Types."  Each Org file MAY
> use one of a list of defined Types at the beginning of the file name
> (which is also the first top level headline in the file, starting at
> position 0), followed by a delimiter (currently ": ").[0]
>
> I keep experimenting with my list of Types (I probably have too many),
> but there are a few that definitely seem useful so far.  For example:
>
> - Project: Pretty self-explanatory.
>
> - Area: A concept lifted straight from PARA Method ("a sphere of
>   activity with a standard to be maintained over time").[1]
>
>   - Equipment: A special type of Area: that pertains to a single major
>     piece of equipment (like a vehicle) or some group of related
>     equipment (e.g. "small shop equipment" or "home appliances",
>     etc.).
>
> - HowTo: Literally "how to do x" which is great for remembering those
>   obscure command line invocations (or whatever) that you only use 2x
>   per year.  Combined with headline level completing-read search (see
>   below) this becomes very powerful/handy.
>
> So then, by default, any Org files starting with either Area:,
> Equipment:, or Project: are the only ones that are considered "active"
> for purposes of agenda and scanned for TODOs (implemented as a simple
> `directory-files' function and a regexp).  I use my system as a
> combination of TODO and PIM[2], so this makes a nice logical split
> where all those PIM "random notes" do not impact the agenda
> performance whatsoever.
>
> I have some other custom agenda functions as well, for things like
> periodic reviews (in the GTD sense) and others.  Org's Agenda really
> is essentially just like a database query engine when you get right
> down to it (except storing in plain text of course).
>
>>> trs> [smaller files] My agenda is not cluttered.
>>
>> it is not clear to me why more smaller files and shallower trees in
>> the outline would improve the agenda.  sounds good though.
>
> I somewhat addressed this above with Types (which improve
> performance), but as to your specific point (clutter)...
>
> OK, so maybe not /directly/.  But rather the whole system has
> improved my engagement, by way of no longer feeling lost/overwhelmed
> as I did with very deep trees in only a few files.  I think it is just
> easier to reason about some small subset of the whole at one time, as
> represented in a single file.  In theory, I guess you could accomplish
> the same by narrowing subtrees or other methods, but for whatever
> reason separate files seem to appeal more to me than those other ways
> (probably because they are also faster to navigate, among other
> benefits).  However, to each their own here, I suppose.
>
> I think I was also responding to some specific comment you made about
> time stamps (re: "cluttered").
>
> There is also this whole "inter linking" / "atomicity" thing.  I came
> to Orgmode from TiddlyWiki, and that was the only thing I missed, this
> notion of many small "atomic" nodes, which could then be put back
> together in many different ways (links, tags, etc.) as opposed to a
> (usually single, large) tree which (at least somewhat) imposes a
> particular structure and implicit categorization.  Nowadays the
> Zettelkasten stuff has

Re: Culling org files (notion of Types, many small vs few big files)

2021-03-05 Thread Tim Cross


TRS-80  writes:

> On 2021-03-04 16:11, Samuel Wales wrote:
> I have some other custom agenda functions as well, for things like
> periodic reviews (in the GTD sense) and others.  Org's Agenda really
> is essentially just like a database query engine when you get right
> down to it (except storing in plain text of course).
>

I think I would agree with this. The agenda is a lot more than just an
'agenda'. I tend to view it as a 'query result', which just displays my
query in different ways. Sometimes it is similar to what most people
would think of as an agenda, but often it is 'something' completely
different.
--
Tim Cross



Re: Culling org files (notion of Types, many small vs few big files)

2021-03-05 Thread TRS-80

On 2021-03-04 16:11, Samuel Wales wrote:

tim> naming convention ... to determine what is included


this is also what i do.  org-agenda-files is just set at startup
according to basename pattern.


I find it very interesting that all three of us seem to have
independently arrived at some of the same conclusions.

Originally I did not want to go off on this tangent, but now I wonder
how close what I am doing is to what you guys are doing.

In my case, I came up with some notion of "Types."  Each Org file MAY
use one of a list of defined Types at the beginning of the file name
(which is also the first top level headline in the file, starting at
position 0), followed by a delimiter (currently ": ").[0]

I keep experimenting with my list of Types (I probably have too many),
but there are a few that definitely seem useful so far.  For example:

- Project: Pretty self-explanatory.

- Area: A concept lifted straight from PARA Method ("a sphere of
  activity with a standard to be maintained over time").[1]

  - Equipment: A special type of Area: that pertains to a single major
    piece of equipment (like a vehicle) or some group of related
    equipment (e.g. "small shop equipment" or "home appliances",
    etc.).

- HowTo: Literally "how to do x" which is great for remembering those
  obscure command line invocations (or whatever) that you only use 2x
  per year.  Combined with headline level completing-read search (see
  below) this becomes very powerful/handy.

So then, by default, any Org files starting with either Area:,
Equipment:, or Project: are the only ones that are considered "active"
for purposes of agenda and scanned for TODOs (implemented as a simple
`directory-files' function and a regexp).  I use my system as a
combination of TODO and PIM[2], so this makes a nice logical split
where all those PIM "random notes" do not impact the agenda
performance whatsoever.
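
A minimal sketch of that file selection, written in Python rather than
with `directory-files' so it can be run outside Emacs.  The Type names
and the ": " delimiter come from the description above; the function
names and the ".org" suffix check are assumptions made for
illustration.

```python
import os
import re

ACTIVE_TYPES = ("Area", "Equipment", "Project")  # Types named in the text above
DELIMITER = ": "                                  # current delimiter per the text


def select_active(basenames, types=ACTIVE_TYPES, delimiter=DELIMITER):
    """Keep only basenames of the form '<Type><delimiter><anything>.org'."""
    alternatives = "|".join(re.escape(t) for t in types)
    pattern = re.compile(
        "^(?:" + alternatives + ")" + re.escape(delimiter) + r".*\.org$"
    )
    return sorted(name for name in basenames if pattern.match(name))


def active_org_files(directory):
    """Hypothetical wrapper: full paths of the active files in one directory."""
    return [os.path.join(directory, name)
            for name in select_active(os.listdir(directory))]
```

In Emacs, the equivalent list is what would be assigned to
`org-agenda-files'; everything else (HowTo: files, untyped notes) is
never opened when the agenda is built.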

I have some other custom agenda functions as well, for things like
periodic reviews (in the GTD sense) and others.  Org's Agenda really
is essentially just like a database query engine when you get right
down to it (except storing in plain text of course).


trs> [smaller files] My agenda is not cluttered.


it is not clear to me why more smaller files and shallower trees in
the outline would improve the agenda.  sounds good though.


I somewhat addressed this above with Types (which improve
performance), but as to your specific point (clutter)...

OK, so maybe not /directly/.  But rather the whole system has
improved my engagement, by way of no longer feeling lost/overwhelmed
as I did with very deep trees in only a few files.  I think it is just
easier to reason about some small subset of the whole at one time, as
represented in a single file.  In theory, I guess you could accomplish
the same by narrowing subtrees or other methods, but for whatever
reason separate files seem to appeal more to me than those other ways
(probably because they are also faster to navigate, among other
benefits).  However, to each their own here, I suppose.

I think I was also responding to some specific comment you made about
time stamps (re: "cluttered").

There is also this whole "inter linking" / "atomicity" thing.  I came
to Orgmode from TiddlyWiki, and that was the only thing I missed, this
notion of many small "atomic" nodes, which could then be put back
together in many different ways (links, tags, etc.) as opposed to a
(usually single, large) tree which (at least somewhat) imposes a
particular structure and implicit categorization.  Nowadays the
Zettelkasten stuff has become quite popular, but it is exactly the
same notion.  Tree knowledge structures are great when many people
must share the same info, for example law codes.  Or reference
manuals.  But in our own PIM, we should be more free to link
information together in whatever way suits our own brains.  And be
able to link it back together in multiple, sometimes differing, ways.
I seem to also recall some discussion of research even supporting the
idea that our brains actually function more like an "interconnected
web" than a "tree" structure (can you tell I am a bit of a PIM geek
and have been interested in this subject for some years now?).  :D

Thinking further, I guess my usage has also become possible by some of
the search and other tools I have built /around/ my directory full of
small files, which obviate some of the reasons for keeping things in
"one or a few big files."

One example is my custom headline search function (which uses grep
under the hood)[3].  It has been very helpful in being able to locate
things.  Now I have a completing-read search over all headlines in all
my files (which will jump to that location upon selection).  I have
found that, by carefully constructing headlines, this works "well
enough" for almost all my search needs so far.[4]
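
A rough sketch of the idea behind such a headline search, in Python
for illustration.  It is not the function described above (which uses
grep and completing-read inside Emacs); the names here are
hypothetical, and a plain substring filter stands in for the
completion step.

```python
import re
from pathlib import Path

# An Org headline: one or more stars, a space, then the headline text.
HEADLINE_RE = re.compile(r"^\*+ (.*\S)\s*$")


def headlines_in_text(name, text):
    """Return (file name, line number, headline) for each headline in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = HEADLINE_RE.match(line)
        if match:
            hits.append((name, lineno, match.group(1)))
    return hits


def collect_headlines(directory):
    """Index the headlines of every .org file directly inside directory."""
    hits = []
    for path in sorted(Path(directory).glob("*.org")):
        hits.extend(headlines_in_text(path.name, path.read_text(encoding="utf-8")))
    return hits


def search_headlines(hits, query):
    """Case-insensitive substring filter over an index of headlines."""
    needle = query.lower()
    return [hit for hit in hits if needle in hit[2].lower()]
```

Each hit carries the file name and line number, which is what a
jump-to-location step would need once a candidate is selected.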


On 3/4/21, Tim Cross  wrote:


My use pattern also constantly evolves as my requirements and priorities
change. It is and

Re: Culling org files

2021-03-04 Thread Samuel Wales
trs> [smaller files] My agenda is not cluttered.

it is not clear to me why more smaller files and shallower trees in
the outline would improve the agenda.  sounds good though.

tim> naming convention ... to determine what is included

this is also what i do.  org-agenda-files is just set at startup
according to basename pattern.

On 3/4/21, Tim Cross  wrote:
>
> TRS-80  writes:
>
>> On 2021-03-03 16:59, Samuel Wales wrote:
>> I have come to similar conclusion about "don't let org files get too
>> big."  Besides agenda speed, I think it is just easier to
>> conceptualize things when each file covers only a limited scope, trees
>> are more shallow, etc.
>>
>> So, lately (last year or more), I have been trying a "many small (up
>> to perhaps medium)" instead of "few big" files approach (along with
>> some custom tooling) and it has been working /a lot/ better for me.  I
>> really feel on top of things for the first time in a long time.  My
>> agenda is not cluttered.  I can focus on important things, while not
>> losing track of the rest, etc.
>>
>
> I agree with this. I have a similar approach. I consider the file system
> and org files to be the initial 'structure' and have many smaller files
> rather than a couple of very large ones. Only a subset of files play a
> role in the agenda (I'm still experimenting with two different
> approaches for this - one uses a couple of functions which can
> dynamically change the agenda list and the other uses a naming
> convention which is used as the basis of a search to determine what is
> included in the agenda. Final result will likely be a combination).
>
> My use pattern also constantly evolves as my requirements and priorities
> change. It is, and probably always will be, a work in progress!
>
> --
> Tim Cross
>
>


-- 
The Kafka Pandemic

Please learn what misopathy is.
https://thekafkapandemic.blogspot.com/2013/10/why-some-diseases-are-wronged.html



Re: Culling org files

2021-03-04 Thread Tim Cross


TRS-80  writes:

> On 2021-03-03 16:59, Samuel Wales wrote:
> I have come to similar conclusion about "don't let org files get too
> big."  Besides agenda speed, I think it is just easier to
> conceptualize things when each file covers only a limited scope, trees
> are more shallow, etc.
>
> So, lately (last year or more), I have been trying a "many small (up
> to perhaps medium)" instead of "few big" files approach (along with
> some custom tooling) and it has been working /a lot/ better for me.  I
> really feel on top of things for the first time in a long time.  My
> agenda is not cluttered.  I can focus on important things, while not
> losing track of the rest, etc.
>

I agree with this. I have a similar approach. I consider the file system
and org files to be the initial 'structure' and have many smaller files
rather than a couple of very large ones. Only a subset of files play a
role in the agenda (I'm still experimenting with two different
approaches for this - one uses a couple of functions which can
dynamically change the agenda list and the other uses a naming
convention which is used as the basis of a search to determine what is
included in the agenda. Final result will likely be a combination).

My use pattern also constantly evolves as my requirements and priorities
change. It is, and probably always will be, a work in progress!

--
Tim Cross



Re: Culling org files

2021-03-04 Thread TRS-80

On 2021-03-03 16:59, Samuel Wales wrote:

along lines of reducing logbook entries


I guess you must have picked up on my comment in another recent
thread.  :)


i often want to reduce org
files, and i wonder if anybody already had the same desire.

here are some random ideas.  my org files are so
large i might have written this list a few times

  1) list links to duplicate headlines
  2) list links to duplicate body text
  3) list links to duplicate entries
  4) list links to duplicate entries, body text, or
     headlines using fuzzy matching
     - suppose you captured an email slightly differently a
       few times
  5) show in agenda the biggest few tasks so you can go to
     them and reduce them or doneify them
  6) (waves hands) git magic to find old entries that might
     be stale
  7) show in agenda the tasks with biggest logbook drawers
     so you can go to them and reduce them
  8) find similar body text that are in distant subtrees
     that might be candidates for refactoring using org-id
     linking
  9) show in agenda deepest olpath levels
  10) indicate deep, shallow, text-filled, etc. top levels
  11) show in agenda entries with most children
  12) archive logbook drawer entries older than 1 year
      - get rid of drawer if empty
      - put the drawer entries into a logbook drawer in a
        new task, with a similar header, that then gets
        doneified.  then that gets archived when you archive
        stuff.
  13) operate on lines matching a pattern
      - e.g. "* [2021-02-17 Wed 20:35]  whatever" lines
        might be insubstantial notes that do not need to
        clutter the inactive timestamp display in the agenda
        and thus should be moved to a target location with
        query
      - that target location would presumably not be in an
        agenda file
  14) function to lint all agenda files
  15) reduce false positives in lint
well, idk if these are good ideas.  just thought maybe we
could form a cult of "don't let org files get too big".
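
Two of the ideas in that list, duplicate headlines (1) and biggest
logbook drawers (7), can be sketched roughly as follows.  This is
Python for illustration, not existing Org functionality: the line-based
parser, the TODO keyword set and all function names are assumptions,
and a real version would use Org's own parser and report to the agenda.

```python
import re
from collections import defaultdict

HEADLINE_RE = re.compile(r"^\*+ (.*\S)\s*$")
KEYWORD_RE = re.compile(r"^(?:TODO|NEXT|DONE) ")  # hypothetical keyword set
TAGS_RE = re.compile(r"\s+:[\w@:]+:$")            # trailing :tag1:tag2:


def normalize(title):
    """Drop TODO keyword and tags, lowercase, collapse whitespace."""
    title = KEYWORD_RE.sub("", title)
    title = TAGS_RE.sub("", title)
    return " ".join(title.lower().split())


def parse_entries(text):
    """Return one dict per headline: line number, title, logbook line count."""
    entries, in_logbook = [], False
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = HEADLINE_RE.match(line)
        if match:
            entries.append({"line": lineno, "title": match.group(1), "logbook": 0})
            in_logbook = False
        elif not entries:
            continue  # text before the first headline
        elif line.strip().upper() == ":LOGBOOK:":
            in_logbook = True
        elif line.strip().upper() == ":END:":
            in_logbook = False
        elif in_logbook:
            entries[-1]["logbook"] += 1
    return entries


def duplicate_headlines(entries):
    """Map each normalized title that occurs more than once to its line numbers."""
    groups = defaultdict(list)
    for entry in entries:
        groups[normalize(entry["title"])].append(entry["line"])
    return {title: lines for title, lines in groups.items() if len(lines) > 1}


def largest_logbooks(entries, count=3):
    """Return (title, logbook lines) for the entries with the biggest drawers."""
    ranked = sorted(entries, key=lambda entry: entry["logbook"], reverse=True)
    return [(e["title"], e["logbook"]) for e in ranked[:count] if e["logbook"]]
```

Exact matching after normalization only covers idea 1; the fuzzy
matching in idea 4 would need a similarity measure on top of this.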


I have come to similar conclusion about "don't let org files get too
big."  Besides agenda speed, I think it is just easier to
conceptualize things when each file covers only a limited scope, trees
are more shallow, etc.

So, lately (last year or more), I have been trying a "many small (up
to perhaps medium)" instead of "few big" files approach (along with
some custom tooling) and it has been working /a lot/ better for me.  I
really feel on top of things for the first time in a long time.  My
agenda is not cluttered.  I can focus on important things, while not
losing track of the rest, etc.

I could write a whole lot about my "custom tooling" but as that is an
entire package in its own right (still in experimental development and
thus unreleased), I will limit my comments here only to the "archival"
portion of this problem.

I realized, at least in my case, after mulling this over for some
time, that there seem to be a few distinct cases which would need to
be handled by a custom archival function:

- If the TODO is still active, and the number of logbook entries
  exceeds some (definable) threshold, either move the older entries to
  a similarly named archive file/heading, or (also definable) simply
  delete them.  This would cover things like habits and other
  recurring tasks that tend to generate lots and lots of entries over
  time.  This is perhaps the part I mentioned in the other thread
  recently.

- If the TODO is completed (and perhaps also after it becomes a
  certain (again, definable) age), then move the whole TODO to a
  similarly named archive file.

- There was another, but I think it was for the case where the entire
  file is a project (which is a bit specific to my own setup).

Ideally, this custom function would handle all the above cases, and
could be called with point at each headline, so it would be easy to
map over a file or even a directory full of files, in order to
automate the archival process (perhaps annually?).
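
The first two cases above can be sketched as a small decision
function.  This is Python for illustration only, not the unreleased
package: the keyword set, the default thresholds, the action names and
the assumption that logbook entries arrive newest first are all
hypothetical.  The third case is left out because it is not specified.

```python
from datetime import date, timedelta

DONE_KEYWORDS = {"DONE", "CANCELLED"}  # hypothetical; Org keywords are user-defined


def plan_archival(keyword, logbook_entries, closed_on=None, *, today,
                  max_entries=20, min_age_days=365, delete_excess=False):
    """Decide what to do with one headline; returns (action, payload).

    logbook_entries is assumed to be ordered newest first, so everything
    past max_entries is the older tail.
    """
    if keyword in DONE_KEYWORDS:
        # Completed TODO: archive the whole subtree once it is old enough.
        old_enough = (closed_on is not None
                      and today - closed_on >= timedelta(days=min_age_days))
        return ("archive-subtree", None) if old_enough else ("keep", None)
    # Active TODO: trim the logbook when it exceeds the threshold.
    excess = logbook_entries[max_entries:]
    if excess:
        action = "delete-entries" if delete_excess else "archive-entries"
        return (action, excess)
    return ("keep", None)
```

Because the function only looks at one headline and returns a plan
rather than editing anything, it could be mapped over every headline
in a file or directory and the resulting plans reviewed before any
entry is moved or deleted.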

Cheers,
TRS-80