Re: [O] Use headings in sitemap

2016-11-02 Thread Nicolas Goaziou
Hello,

Thibault Marin  writes:

> The latest update works for me (all my previously reported issues are
> fixed).  I have also tested anti-chronological sorting, which works
> too.

Great.

> Please let me know if you'd like me to run additional tests.

I wrote and pushed a full test suite for site-map generation.

Meanwhile, I discovered we could no longer obey :sitemap-sans-extension,
so I removed the property and suggested an equivalent,
using :sitemap-format-entry, in ORG-NEWS.
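
For readers looking for such an equivalent, a minimal sketch follows.
It assumes the three-argument form of :sitemap-format-entry discussed
in this thread (ENTRY, ROOT, STYLE) and the one-argument call to
`org-publish-find-title' used elsewhere in the thread; released Org
versions may use different argument lists, so check ORG-NEWS before
copying it.  The function name is hypothetical.

```elisp
;; Hypothetical replacement for :sitemap-sans-extension.  Assumes the
;; (ENTRY ROOT STYLE) calling convention from this thread.
(defun my/sitemap-entry-sans-extension (entry root _style)
  "Format ENTRY as a link relative to ROOT, without its file extension."
  (if (directory-name-p entry)
      ;; Directories are shown by name only.
      (file-name-nondirectory (directory-file-name entry))
    (format "[[file:%s][%s]]"
            ;; Drop the extension from the link target only.
            (file-name-sans-extension (file-relative-name entry root))
            (org-publish-find-title entry))))
```

It would be set as the value of :sitemap-format-entry in the project
definition.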

Another optimization would be to call :sitemap-format-entry with two
arguments instead of three: the /relative/ file name and the site-map
style (instead of the absolute file name, root directory and style).
However, this change would require changing keys in the cache, which
would invalidate any existing cache.

I'm not sure this change is worth the effort.

Regards,

-- 
Nicolas Goaziou



Re: [O] Use headings in sitemap

2016-10-31 Thread Thibault Marin

> I think this is a genuine bug. Exclude regexp should be matched against
> relative file names, not absolute ones. I fixed it in wip-sitemap. You
> may want to rebase the branch if you want to experiment with the fix.

The latest update works for me (all my previously reported issues are
fixed).  I have also tested anti-chronological sorting, which works
too.

Please let me know if you'd like me to run additional tests.

Thanks.




Re: [O] Use headings in sitemap

2016-10-31 Thread Nicolas Goaziou
Thibault Marin  writes:

> Sorry for the confusion, I don't think anything is wrong with the new
> ox-publish.el, but the selection of excluded files by regexp seems to
> have changed (I personally have no problem with this change, I just
> thought I'd mention it).
>
> My directory structure has a "website/org/" component so the loose
> regexp "website.org" used for exclusion matched any file path (it
> appears that full paths are used to determine which files should be
> excluded), so, for instance, "/path/to/website/org/index.org" was
> excluded, which I do not want.

I think this is a genuine bug. Exclude regexp should be matched against
relative file names, not absolute ones. I fixed it in wip-sitemap. You
may want to rebase the branch if you want to experiment with the fix.

Thank you for the explanation.

Regards,



Re: [O] Use headings in sitemap

2016-10-31 Thread Thibault Marin

> I'm not sure I understand. Why is an empty file list
> a problem? Is there an error in the new "ox-publish.el"?

Sorry for the confusion, I don't think anything is wrong with the new
ox-publish.el, but the selection of excluded files by regexp seems to
have changed (I personally have no problem with this change, I just
thought I'd mention it).

My directory structure has a "website/org/" component so the loose
regexp "website.org" used for exclusion matched any file path (it
appears that full paths are used to determine which files should be
excluded), so, for instance, "/path/to/website/org/index.org" was
excluded, which I do not want.

The same loose regexp did not exclude "index.org" in the previous
version of the sitemap functions. Maybe the exclude regexp was applied
to file names relative to root ("index.org" in my example)?

With a more restrictive regexp "website\\.org", everything behaves as
expected.
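
The difference between the two patterns comes down to the unescaped
dot, which matches any character, including the "/" in "website/org".
A small sketch, using only the paths quoted above:

```elisp
;; An unescaped "." matches any character, so the loose pattern also
;; matches the "website/org" directory component of an absolute path.
(string-match-p "website.org" "/path/to/website/org/index.org")   ; => 9
(string-match-p "website\\.org" "/path/to/website/org/index.org") ; => nil

;; Against the name relative to the base directory, neither matches.
(string-match-p "website.org" "index.org")                        ; => nil
(string-match-p "website\\.org" "index.org")                      ; => nil
```

This is why the loose pattern only became a problem once absolute
file names were matched.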

I hope it is clearer.

Thanks.



Re: [O] Use headings in sitemap

2016-10-31 Thread Nicolas Goaziou
Hello,

Thibault Marin  writes:

> I don't have the `directory-name-p' function (I am still on emacs 24),
> so I made a simplistic one: (string= file (file-name-sans-extension
> file)), it seems to be sufficient for my test-case.  I don't know if not
> being on 25 will cause other issues.

Fixed.

> I also had to add a call to `expand-file-name' around the definition of
> the `root' variable (in `org-publish-sitemap') to account for the fact
> that my :base-directory is defined with "~/" instead of "/home/...".

Fixed, too.

> Another thing I had to modify was the :exclude pattern, which was
> malformed earlier ("setup.org\\|website.org\\|rss.org" changed to
> "setup\\.org\\|website\\.org\\|rss\\.org").  The earlier version of the
> pattern results in an empty file list but was not a problem on the older
> version of the sitemap tools.

I'm not sure I understand. Why is an empty file list
a problem? Is there an error in the new "ox-publish.el"?

> (let ((date
>        (org-element-interpret-data
>         (org-publish-find-property entry :date))))

There is also `org-publish-find-date', which is slightly different.

Thanks for the feedback.

Regards,

-- 
Nicolas Goaziou



Re: [O] Use headings in sitemap

2016-10-30 Thread Thibault Marin
Nicolas Goaziou writes:

> I pushed an implementation of that idea in wip-sitemap branch, if anyone
> wants to test it.

Thanks!

> For example, setting :sitemap-function property to
>
>    (lambda (title list)
>      (concat "#+TITLE: " title "\n\n"
>              (org-list-to-subtree list)))
>
> mostly achieves what the OP wants.

I don't have the `directory-name-p' function (I am still on emacs 24),
so I made a simplistic one: (string= file (file-name-sans-extension
file)), it seems to be sufficient for my test-case.  I don't know if not
being on 25 will cause other issues.
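
For anyone else on Emacs 24, a slightly stricter stand-in is sketched
below.  It is only an illustration, not the fix that went into Org,
and the function name is hypothetical.  It treats a name as a
directory name when appending a directory separator would not change
it.

```elisp
;; Hypothetical fallback for `directory-name-p' (added in Emacs 25).
(defun my/directory-name-p (name)
  "Return non-nil if NAME ends with a directory separator."
  (string= name (file-name-as-directory name)))

(my/directory-name-p "posts/")      ; => t
(my/directory-name-p "posts/A.org") ; => nil
```

Unlike the extension-based test, this does not mistake an
extensionless file such as "README" for a directory.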

I also had to add a call to `expand-file-name' around the definition of
the `root' variable (in `org-publish-sitemap') to account for the fact
that my :base-directory is defined with "~/" instead of "/home/...".

Another thing I had to modify was the :exclude pattern, which was
malformed earlier ("setup.org\\|website.org\\|rss.org" changed to
"setup\\.org\\|website\\.org\\|rss\\.org").  The earlier version of the
pattern results in an empty file list but was not a problem on the older
version of the sitemap tools.  Anyway, I have now fixed my setup.

> Also, setting :sitemap-format-entry
> to
>
>    (lambda (entry root style)
>      (if (directory-name-p entry)
>          (file-name-nondirectory (directory-file-name entry))
>        (format
>         "[[file:%s][%s]]%s"
>         (file-relative-name entry root)
>         (org-publish-find-title entry)
>         (let ((subtitle
>                (org-element-interpret-data
>                 (org-publish-find-property entry :subtitle 'latex))))
>           (if (equal subtitle "") "" (format " (%s)" subtitle))))))

This is perfect for me, thanks.  I wanted to display the date along with
the title for all the pages in the posts heading so I used the following
(I should be able to filter the folder name better than this, this was
just to test things out).

   (lambda (entry root style)
     (if (directory-name-p entry)
         (file-name-nondirectory (directory-file-name entry))
       (format
        "[[file:%s][%s%s]]"
        (file-relative-name entry root)
        (let ((date
               (org-element-interpret-data
                (org-publish-find-property entry :date))))
          (if (or (equal date "")
                  (not (string-match
                        "posts/"
                        (file-relative-name entry root))))
              "" (format "(%s) "
                         (replace-regexp-in-string
                          "[<>]" ""
                          date))))
        (org-publish-find-title entry))))

> Feedback welcome.

From my limited use, this perfectly fits my needs.  The only thing I
have not fully tested yet is the sorting mechanism, I'll try that soon.

Thanks,
thibault



Re: [O] Use headings in sitemap

2016-10-30 Thread Nicolas Goaziou
Nicolas Goaziou  writes:

> Hello,
>
> Rasmus Pank Roulund  writes:
>
>> Nicolas Goaziou  writes:
>
>> It’s not quite that complicated in my patch/WIP.  You specify an ordering
>> function.  E.g. the plain list is:
>>
>>  (defun org-publish-org-sitemap-as-list (files project-plist)
>>    "Insert FILES as simple list separated by newlines.
>>  PROJECT-PLIST holds the project information."
>>    (mapconcat
>>     (lambda (file) (org-publish-format-file-entry
>>                     org-publish-sitemap-file-entry-format
>>                     file project-plist))
>>     files "\n"))
>>
>> If you don’t have the full flexibility of a function I guess someone will
>> always run into trouble eventually...
>
> I think one mistake here is to conflate style and formatting. By doing
> so, defining a new style implies that one has to handle sorting,
> directories (or lack thereof)... and also Org syntax.
>
> I suggest keeping style as a means to control how the file names are
> provided, and separate it from the formatting process, handled
> by :sitemap-function and :sitemap-format-entry or some such.
>
> We might, however, by this definition, merge sorting and style together
> (e.g., tree-date-ascending list-name-descending).
>
>>> I suggest to let :sitemap-function operate on the lists of files
>>> included in the sitemap (i.e., the list of files in the project),
>>> already ordered, and formatted according to
>>> `org-publish-sitemap-file-entry-format'.
>>
>> Isn’t that what my patch does?
>
> More or less, but my proposal is slightly different. E.g., I suggest
> a different data type for the arguments.
>
> OTOH, your patch does other things orthogonal to my proposal (e.g.
> preamble and postambles for sitemaps...).
>
>> I like that, but AFAIK the backend is not known at the time the sitemap is
>> generated.  And it might not be deducible from the publishing
>> function.
>
> You might have misread my proposal. 
>
> I'm suggesting to leave it up to the user. Whenever they define a new
> sitemap function and need to implement a formatting function, they can
> provide the name of the back-end they want to use. This information is
> known to the user.
>
> Conversely, we do not provide any ready-to-use keyword (so, no format
> string with placeholders) because, as you write, we cannot predict the
> back-end with certainty. Instead, we merely implement a generic getter
> function (which you mostly implemented in your patch set).

I pushed an implementation of that idea in wip-sitemap branch, if anyone
wants to test it.

For example, setting :sitemap-function property to

   (lambda (title list)
     (concat "#+TITLE: " title "\n\n"
             (org-list-to-subtree list)))

mostly achieves what the OP wants. Also, setting :sitemap-format-entry
to

   (lambda (entry root style)
     (if (directory-name-p entry)
         (file-name-nondirectory (directory-file-name entry))
       (format
        "[[file:%s][%s]]%s"
        (file-relative-name entry root)
        (org-publish-find-title entry)
        (let ((subtitle
               (org-element-interpret-data
                (org-publish-find-property entry :subtitle 'latex))))
          (if (equal subtitle "") "" (format " (%s)" subtitle))))))

will add a subtitle to the entry, when available, upon publishing to
LaTeX.
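
To show where such properties go, here is a sketch of a project
definition.  The project name, the paths and the function name are
hypothetical; the sitemap properties are the ones discussed in this
thread, combined with standard publishing properties.

```elisp
;; Hypothetical project; adjust names and paths to your own setup.
(defun my/sitemap-as-subtree (title list)
  "Return sitemap contents with TITLE, rendering LIST as headings."
  (concat "#+TITLE: " title "\n\n"
          (org-list-to-subtree list)))

;; Note: this replaces any existing value of the variable.
(setq org-publish-project-alist
      '(("website"
         :base-directory "~/website/org/"
         :publishing-directory "~/website/public/"
         :publishing-function org-html-publish-to-html
         :auto-sitemap t
         :sitemap-filename "sitemap.org"
         :sitemap-style tree
         :sitemap-sort-files anti-chronologically
         :sitemap-function my/sitemap-as-subtree)))
```

A named function is used instead of a quoted lambda so that it can be
byte-compiled and redefined without touching the project list.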

Feedback welcome.


Regards,



Re: [O] Use headings in sitemap

2016-10-12 Thread Nicolas Goaziou
Hello,

Rasmus Pank Roulund  writes:

> Nicolas Goaziou  writes:

> It’s not quite that complicated in my patch/WIP.  You specify an ordering
> function.  E.g. the plain list is:
>
>  (defun org-publish-org-sitemap-as-list (files project-plist)
>    "Insert FILES as simple list separated by newlines.
>  PROJECT-PLIST holds the project information."
>    (mapconcat
>     (lambda (file) (org-publish-format-file-entry
>                     org-publish-sitemap-file-entry-format
>                     file project-plist))
>     files "\n"))
>
> If you don’t have the full flexibility of a function I guess someone will
> always run into trouble eventually...

I think one mistake here is to conflate style and formatting. By doing
so, defining a new style implies that one has to handle sorting,
directories (or lack thereof)... and also Org syntax.

I suggest keeping style as a means to control how the file names are
provided, and separate it from the formatting process, handled
by :sitemap-function and :sitemap-format-entry or some such.

We might, however, by this definition, merge sorting and style together
(e.g., tree-date-ascending list-name-descending).

>> I suggest to let :sitemap-function operate on the lists of files
>> included in the sitemap (i.e., the list of files in the project),
>> already ordered, and formatted according to
>> `org-publish-sitemap-file-entry-format'.
>
> Isn’t that what my patch does?

More or less, but my proposal is slightly different. E.g., I suggest
a different data type for the arguments.

OTOH, your patch does other things orthogonal to my proposal (e.g.
preamble and postambles for sitemaps...).

> I like that, but AFAIK the backend is not known at the time the sitemap is
> generated.  And it might not be deducible from the publishing
> function.

You might have misread my proposal. 

I'm suggesting to leave it up to the user. Whenever they define a new
sitemap function and need to implement a formatting function, they can
provide the name of the back-end they want to use. This information is
known to the user.

Conversely, we do not provide any ready-to-use keyword (so, no format
string with placeholders) because, as you write, we cannot predict the
back-end with certainty. Instead, we merely implement a generic getter
function (which you mostly implemented in your patch set).

Regards,

-- 
Nicolas Goaziou



Re: [O] Use headings in sitemap

2016-10-12 Thread Rasmus Pank Roulund
Nicolas Goaziou  writes:

> Hello,
>
> Thibault Marin  writes:
>
>> I would like to generate a sitemap for a published website and use it
>> to extract the last few entries in a specific folder to put on the
>> main page.
>>
>> The site structure looks like:
>> .
>> ├── index.org
>> ├── posts
>> │   ├── A.org
>> │   ├── B.org
>> │   └── C.org
>> ├── misc
>> │   ├── page.org
>> │   └── other-page.org
>> └── sitemap.org
>>
>> In index.org, I would have:
>>
>> #+begin_src org
>> #+INCLUDE: sitemap.org::*posts :lines "-10" :only-contents t
>> #+end_src
>>
>> to include links to the 10 most recent pages in =posts= (I use
>> :sitemap-sort-files anti-chronologically in the project setup).  If I
>> am not missing anything, this requires the sitemap.org file to have a
>> =posts= heading, but the `org-publish-org-sitemap' function only
>> produces a list of pages.
>>
>> If there is no better way to get this to work, I would like to propose
>> a patch to `org-publish-org-sitemap' to produce headings in the
>> sitemap file when a new parameter is passed and non-nil.  The attached
>> patch is my first attempt at it, it works for my tests.
>>
>> I would be interested to hear people's opinion on this:
>> - Is there a better way to achieve what I want?
>> - Is the proposed patch acceptable?  Any comments would be appreciated.
>
> This reminds me of a patch Rasmus (Cc'ed) is working on (thread starting
> at: ).

This is still WIP.  I guess we were discussing the "hows" in that thread
as well.

> I'd like to propose here a slightly different, hopefully simpler
> approach so as to get flexibility without entering keyword hell.
>
> The first thing to note is that :sitemap-function is, IMO, unusable,
> because it puts too much work in the hands of the user. Indeed, they
> have to generate the title of the sitemap page, get the list of files in
> the project, walk that list, handle sorting according to style...

It’s not quite that complicated in my patch/WIP.  You specify an ordering
function.  E.g. the plain list is:

 (defun org-publish-org-sitemap-as-list (files project-plist)
   "Insert FILES as simple list separated by newlines.
 PROJECT-PLIST holds the project information."
   (mapconcat
    (lambda (file) (org-publish-format-file-entry
                    org-publish-sitemap-file-entry-format
                    file project-plist))
    files "\n"))

If you don’t have the full flexibility of a function I guess someone will
always run into trouble eventually...

> I suggest to let :sitemap-function operate on the lists of files
> included in the sitemap (i.e., the list of files in the project),
> already ordered, and formatted according to
> `org-publish-sitemap-file-entry-format'.

Isn’t that what my patch does?  The file sorting function calls the
formatter, providing these arguments.  We could move the formatting back
into the "main" sitemap publishing function, to hide it from users, if
that’s better.

(format-spec
 fmt
 `((?t . ,(and (not (directory-name-p file)) (org-publish-find-title file t)))
   (?s . ,(and (not (directory-name-p file)) (org-publish-find-subtitle file t)))
   (?f . ,filename)
   (?F . ,(directory-file-name
           (if (directory-name-p filename)
               (file-relative-name
                dirname (org-publish--dir-parent dirname))
             (file-relative-name filename dirname))))
   (?l . ,link)
   (?h . ,(concat (make-string depth ?*)))
   (?i . ,(concat (make-string (* 2 depth) ? ) "-"))
   (?d . ,(and (not (directory-name-p file))
               (format-time-string
                (or (plist-get project-plist :sitemap-date-format)
                    org-publish-sitemap-date-format)
                (org-publish-find-date file))))))


> The list would be provided in the same format as the return value from
> `org-list-to-lisp', so that, e.g., `org-list-to-subtree' can be directly
> called on it.

> Also, I suggest to make `org-publish-sitemap-file-entry-format'
> a function instead of a string, so as to get more power, i.e., to not
> limit ourselves to the list of placeholders allowed in the format
> string. In particular, we could provide a public function
> org-publish-get-keyword (file keyword  backend), much like what
> Rasmus does in his patchset, but with a back-end so as to get the value
> of any export keyword. Also, this would make
> `org-publish-sitemap-dir-entry-format' unnecessary.

I like that, but AFAIK the backend is not known at the time the sitemap is
generated.  And it might not be deducible from the publishing function.

> Eventually, we could run a hook at the end of `org-publish-org-sitemap',
> which would now always be called, in order to give the opportunity to
> modify the 

Re: [O] Use headings in sitemap

2016-10-11 Thread Thibault Marin

Nicolas Goaziou writes:
> This reminds me of a patch Rasmus (Cc'ed) is working on (thread starting
> at: ).
I missed that for some reason, it is better and more ambitious.

> I suggest to let :sitemap-function operate on the lists of files
> included in the sitemap (i.e., the list of files in the project),
> already ordered, and formatted according to
> `org-publish-sitemap-file-entry-format'.
>
> The list would be provided in the same format as the return value from
> `org-list-to-lisp', so that, e.g., `org-list-to-subtree' can be directly
> called on it.
That sounds good to me.

> Also, I suggest to make `org-publish-sitemap-file-entry-format'
> a function instead of a string, so as to get more power, i.e., to not
> limit ourselves to the list of placeholders allowed in the format
> string. In particular, we could provide a public function
> org-publish-get-keyword (file keyword  backend), much like what
> Rasmus does in his patchset, but with a back-end so as to get the value
> of any export keyword. Also, this would make
> `org-publish-sitemap-dir-entry-format' unnecessary.
>
> Eventually, we could run a hook at the end of `org-publish-org-sitemap',
> which would now always be called, in order to give the opportunity to
> modify the sitemap as a whole (e.g., the title).
>
> In a nutshell, ISTM that it would solve both your request and the
> difficulties encountered by Rasmus in changes.
>
> WDYT?
I think it would definitely address my needs and clearly improve the
overall process.  I'll need some time to digest this as I am not too
familiar with the process, but please let me know how I can help with
this (implementation and testing).

Thanks.
thibault



Re: [O] Use headings in sitemap

2016-10-11 Thread Nicolas Goaziou
Hello,

Thibault Marin  writes:

> I would like to generate a sitemap for a published website and use it
> to extract the last few entries in a specific folder to put on the
> main page.
>
> The site structure looks like:
> .
> ├── index.org
> ├── posts
> │   ├── A.org
> │   ├── B.org
> │   └── C.org
> ├── misc
> │   ├── page.org
> │   └── other-page.org
> └── sitemap.org
>
> In index.org, I would have:
>
> #+begin_src org
> #+INCLUDE: sitemap.org::*posts :lines "-10" :only-contents t
> #+end_src
>
> to include links to the 10 most recent pages in =posts= (I use
> :sitemap-sort-files anti-chronologically in the project setup).  If I
> am not missing anything, this requires the sitemap.org file to have a
> =posts= heading, but the `org-publish-org-sitemap' function only
> produces a list of pages.
>
> If there is no better way to get this to work, I would like to propose
> a patch to `org-publish-org-sitemap' to produce headings in the
> sitemap file when a new parameter is passed and non-nil.  The attached
> patch is my first attempt at it, it works for my tests.
>
> I would be interested to hear people's opinion on this:
> - Is there a better way to achieve what I want?
> - Is the proposed patch acceptable?  Any comments would be appreciated.

This reminds me of a patch Rasmus (Cc'ed) is working on (thread starting
at: ).

I'd like to propose here a slightly different, hopefully simpler
approach so as to get flexibility without entering keyword hell.

The first thing to note is that :sitemap-function is, IMO, unusable,
because it puts too much work in the hands of the user. Indeed, they
have to generate the title of the sitemap page, get the list of files in
the project, walk that list, handle sorting according to style...

I suggest to let :sitemap-function operate on the lists of files
included in the sitemap (i.e., the list of files in the project),
already ordered, and formatted according to
`org-publish-sitemap-file-entry-format'.

The list would be provided in the same format as the return value from
`org-list-to-lisp', so that, e.g., `org-list-to-subtree' can be directly
called on it.
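
As an illustration of that format, the structure below sketches what
such a list could look like for a project with one top-level file and
a "posts" directory.  The exact shape an implementation would produce
is an assumption here, and the link strings and variable name are
made up.

```elisp
;; Same shape as the return value of `org-list-to-lisp':
;; (TYPE ITEM...), where each ITEM is a list of strings and sub-lists.
(defvar my/example-sitemap-list
  '(unordered
    ("[[file:index.org][Index]]")
    ("posts"
     (unordered
      ("[[file:posts/A.org][A]]")
      ("[[file:posts/B.org][B]]"))))
  "Hypothetical sitemap list, for illustration only.")

;; With Org loaded, this would render each item as a heading, nested
;; by depth:
;; (org-list-to-subtree my/example-sitemap-list)
```

With this shape, a "posts" directory item would become a heading that
an #+INCLUDE line can target.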

Also, I suggest to make `org-publish-sitemap-file-entry-format'
a function instead of a string, so as to get more power, i.e., to not
limit ourselves to the list of placeholders allowed in the format
string. In particular, we could provide a public function
org-publish-get-keyword (file keyword  backend), much like what
Rasmus does in his patchset, but with a back-end so as to get the value
of any export keyword. Also, this would make
`org-publish-sitemap-dir-entry-format' unnecessary.

Eventually, we could run a hook at the end of `org-publish-org-sitemap',
which would now always be called, in order to give the opportunity to
modify the sitemap as a whole (e.g., the title).

In a nutshell, ISTM that it would solve both your request and the
difficulties encountered by Rasmus in changes.

WDYT?


Regards,

-- 
Nicolas Goaziou