Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies
Ihor Radchenko writes: > I believe that it is useful for the users to see such issues instead of, > say, failing silently on malformed bibliographies. Applied onto main via 5b45ad083. Best, Ihor
Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies
John Kitchin writes: > I would see if you can cache the result and not do it more than needed; it > can add a performance issue on large files. The results of parsing are already cached. See 7ddc5b57c. With this patch, I'd expect 2x performance degradation on (1) first time Org opens the bibliography in current Emacs session; (2) after bibliography is changed. I believe that it is anyway worth it. Having errors in bibliography can be easily overlooked and can potentially cause annoying issues. Best, Ihor
Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies
I would see if you can cache the result and not do it more than needed; it can add a performance issue on large files. John --- Professor John Kitchin (he/him/his) Doherty Hall A207F Department of Chemical Engineering Carnegie Mellon University Pittsburgh, PA 15213 412-268-7803 @johnkitchin http://kitchingroup.cheme.cmu.edu On Wed, Apr 20, 2022 at 8:39 AM Bruce D'Arcus wrote: > On Wed, Apr 20, 2022 at 8:28 AM Ihor Radchenko wrote: > > > > There is an edge case triggering infinite loop in oc-basic. > > > > It is caused by bibtex-map-entries (used in > > org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1]. > > > > The proposed patch validates the bibtex buffer before processing and > > throws an error if issues are found. `bibtex-validate` also > > conveniently displays a list of errors with clickable links to > > problematic lines. > > > > I believe that it is useful for the users to see such issues instead of, > > say, failing silently on malformed bibliographies. > > > > WDYT? > > I haven't tested it, but this is an excellent idea! > > Bruce > > > Best, > > Ihor > > > > [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036 > > > > * lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the > > bibliography before parsing. Display list of issues if any (via > > `bibtex-validate`). > > (org-cite-basic--parse-bibliography): Set buffer file name needed by > > `bibtex-validate`. Empty the cache in case of error. > > --- > > lisp/oc-basic.el | 38 -- > > 1 file changed, 24 insertions(+), 14 deletions(-) > > > > diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el > > index 873986d07..79f7a4844 100644 > > --- a/lisp/oc-basic.el > > +++ b/lisp/oc-basic.el > > @@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect) > >(let ((entries (make-hash-table :test #'equal)) > > (bibtex-sort-ignore-string-entries t)) > > (bibtex-set-dialect dialect t) > > +;; Throw an error if bibliography is malformed. > > +(unless (bibtex-validate) > > + (user-error "Malformed bibliography at %S" > > + (or (buffer-file-name) (current-buffer > > (bibtex-map-entries > > (lambda (key &rest _) > > ;; Normalize entries: field names are turned into symbols > > @@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography > (&optional info) > > (when (or (org-file-has-changed-p file) > >(not (gethash file > org-cite-basic--file-id-cache))) > >(insert-file-contents file) > > + (setf (buffer-file-name) file) > >(puthash file (org-buffer-hash) > org-cite-basic--file-id-cache)) > > - (let* ((file-id (cons file (gethash file > org-cite-basic--file-id-cache))) > > - (entries > > -(or (cdr (assoc file-id > org-cite-basic--bibliography-cache)) > > -(let ((table > > - (pcase (file-name-extension file) > > - ("json" (org-cite-basic--parse-json)) > > - ("bib" (org-cite-basic--parse-bibtex > 'biblatex)) > > - ("bibtex" > (org-cite-basic--parse-bibtex 'BibTeX)) > > - (ext > > - (user-error "Unknown bibliography > extension: %S" > > - ext) > > - (push (cons file-id table) > org-cite-basic--bibliography-cache) > > - table > > - (push (cons file entries) results) > > +(unwind-protect > > +(condition-case nil > > +(unwind-protect > > + (let* ((file-id (cons file (gethash file > org-cite-basic--file-id-cache))) > > + (entries > > +(or (cdr (assoc file-id > org-cite-basic--bibliography-cache)) > > +(let ((table > > + (pcase (file-name-extension > file) > > + ("json" > (org-cite-basic--parse-json)) > > + ("bib" > (org-cite-basic--parse-bibtex 'biblatex)) > > + ("bibtex" > (org-cite-basic--parse-bibtex 'BibTeX)) > > + (ext > > + (user-error "Unknown > bibliography extension: %S" > > + ext) > > + (push (cons file-id table) > org-cite-basic--bibliography-cache) > > + table > > + (push (cons file entries) results)) > > + (setf
Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies
On Wed, Apr 20, 2022 at 8:28 AM Ihor Radchenko wrote: > > There is an edge case triggering infinite loop in oc-basic. > > It is caused by bibtex-map-entries (used in > org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1]. > > The proposed patch validates the bibtex buffer before processing and > throws an error if issues are found. `bibtex-validate` also > conveniently displays a list of errors with clickable links to > problematic lines. > > I believe that it is useful for the users to see such issues instead of, > say, failing silently on malformed bibliographies. > > WDYT? I haven't tested it, but this is an excellent idea! Bruce > Best, > Ihor > > [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036 > > * lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the > bibliography before parsing. Display list of issues if any (via > `bibtex-validate`). > (org-cite-basic--parse-bibliography): Set buffer file name needed by > `bibtex-validate`. Empty the cache in case of error. > --- > lisp/oc-basic.el | 38 -- > 1 file changed, 24 insertions(+), 14 deletions(-) > > diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el > index 873986d07..79f7a4844 100644 > --- a/lisp/oc-basic.el > +++ b/lisp/oc-basic.el > @@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect) >(let ((entries (make-hash-table :test #'equal)) > (bibtex-sort-ignore-string-entries t)) > (bibtex-set-dialect dialect t) > +;; Throw an error if bibliography is malformed. > +(unless (bibtex-validate) > + (user-error "Malformed bibliography at %S" > + (or (buffer-file-name) (current-buffer > (bibtex-map-entries > (lambda (key &rest _) > ;; Normalize entries: field names are turned into symbols > @@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography (&optional > info) > (when (or (org-file-has-changed-p file) >(not (gethash file org-cite-basic--file-id-cache))) >(insert-file-contents file) > + (setf (buffer-file-name) file) >(puthash file (org-buffer-hash) org-cite-basic--file-id-cache)) > - (let* ((file-id (cons file (gethash file > org-cite-basic--file-id-cache))) > - (entries > -(or (cdr (assoc file-id > org-cite-basic--bibliography-cache)) > -(let ((table > - (pcase (file-name-extension file) > - ("json" (org-cite-basic--parse-json)) > - ("bib" (org-cite-basic--parse-bibtex > 'biblatex)) > - ("bibtex" (org-cite-basic--parse-bibtex > 'BibTeX)) > - (ext > - (user-error "Unknown bibliography > extension: %S" > - ext) > - (push (cons file-id table) > org-cite-basic--bibliography-cache) > - table > - (push (cons file entries) results) > +(unwind-protect > +(condition-case nil > +(unwind-protect > + (let* ((file-id (cons file (gethash file > org-cite-basic--file-id-cache))) > + (entries > +(or (cdr (assoc file-id > org-cite-basic--bibliography-cache)) > +(let ((table > + (pcase (file-name-extension file) > + ("json" > (org-cite-basic--parse-json)) > + ("bib" > (org-cite-basic--parse-bibtex 'biblatex)) > + ("bibtex" > (org-cite-basic--parse-bibtex 'BibTeX)) > + (ext > + (user-error "Unknown > bibliography extension: %S" > + ext) > + (push (cons file-id table) > org-cite-basic--bibliography-cache) > + table > + (push (cons file entries) results)) > + (setf (buffer-file-name) nil)) > + (error (setq org-cite-basic--file-id-cache nil))) >(when info (plist-put info :cite-basic/bibliography results)) >results))) > > -- > 2.35.1 > > > > -- > Ihor Radchenko, > PhD, > Center for Advancing Materials Performance from the Nanoscale (CAMP-nano) > State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong > University, Xi'an, China > Email: yanta...@gmail.com, ihor_radche...@alumni.sutd.edu.sg >
[PATCH] oc-basic: Detect malformed bibtex bibliographies
There is an edge case triggering infinite loop in oc-basic. It is caused by bibtex-map-entries (used in org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1]. The proposed patch validates the bibtex buffer before processing and throws an error if issues are found. `bibtex-validate` also conveniently displays a list of errors with clickable links to problematic lines. I believe that it is useful for the users to see such issues instead of, say, failing silently on malformed bibliographies. WDYT? Best, Ihor [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036 * lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the bibliography before parsing. Display list of issues if any (via `bibtex-validate`). (org-cite-basic--parse-bibliography): Set buffer file name needed by `bibtex-validate`. Empty the cache in case of error. --- lisp/oc-basic.el | 38 -- 1 file changed, 24 insertions(+), 14 deletions(-) diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el index 873986d07..79f7a4844 100644 --- a/lisp/oc-basic.el +++ b/lisp/oc-basic.el @@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect) (let ((entries (make-hash-table :test #'equal)) (bibtex-sort-ignore-string-entries t)) (bibtex-set-dialect dialect t) +;; Throw an error if bibliography is malformed. +(unless (bibtex-validate) + (user-error "Malformed bibliography at %S" + (or (buffer-file-name) (current-buffer (bibtex-map-entries (lambda (key &rest _) ;; Normalize entries: field names are turned into symbols @@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography (&optional info) (when (or (org-file-has-changed-p file) (not (gethash file org-cite-basic--file-id-cache))) (insert-file-contents file) + (setf (buffer-file-name) file) (puthash file (org-buffer-hash) org-cite-basic--file-id-cache)) - (let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache))) - (entries -(or (cdr (assoc file-id org-cite-basic--bibliography-cache)) -(let ((table - (pcase (file-name-extension file) - ("json" (org-cite-basic--parse-json)) - ("bib" (org-cite-basic--parse-bibtex 'biblatex)) - ("bibtex" (org-cite-basic--parse-bibtex 'BibTeX)) - (ext - (user-error "Unknown bibliography extension: %S" - ext) - (push (cons file-id table) org-cite-basic--bibliography-cache) - table - (push (cons file entries) results) +(unwind-protect +(condition-case nil +(unwind-protect + (let* ((file-id (cons file (gethash file org-cite-basic--file-id-cache))) + (entries +(or (cdr (assoc file-id org-cite-basic--bibliography-cache)) +(let ((table + (pcase (file-name-extension file) + ("json" (org-cite-basic--parse-json)) + ("bib" (org-cite-basic--parse-bibtex 'biblatex)) + ("bibtex" (org-cite-basic--parse-bibtex 'BibTeX)) + (ext + (user-error "Unknown bibliography extension: %S" + ext) + (push (cons file-id table) org-cite-basic--bibliography-cache) + table + (push (cons file entries) results)) + (setf (buffer-file-name) nil)) + (error (setq org-cite-basic--file-id-cache nil))) (when info (plist-put info :cite-basic/bibliography results)) results))) -- 2.35.1 -- Ihor Radchenko, PhD, Center for Advancing Materials Performance from the Nanoscale (CAMP-nano) State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an, China Email: yanta...@gmail.com, ihor_radche...@alumni.sutd.edu.sg