Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies

2022-07-09 Thread Ihor Radchenko
Ihor Radchenko  writes:

> I believe that it is useful for the users to see such issues instead of,
> say, failing silently on malformed bibliographies.

Applied onto main via 5b45ad083.

Best,
Ihor



Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies

2022-04-20 Thread Ihor Radchenko
John Kitchin  writes:

> I would see if you can cache the result and not do it more than needed; it
> can add a performance issue on large files.

The results of parsing are already cached. See 7ddc5b57c.

With this patch, I'd expect 2x performance degradation on (1) first time
Org opens the bibliography in current Emacs session; (2) after
bibliography is changed.

I believe that it is anyway worth it. Having errors in bibliography can
be easily overlooked and can potentially cause annoying issues.

Best,
Ihor



Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies

2022-04-20 Thread John Kitchin
I would see if you can cache the result and not do it more than needed; it
can add a performance issue on large files.

John

---
Professor John Kitchin (he/him/his)
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
@johnkitchin
http://kitchingroup.cheme.cmu.edu



On Wed, Apr 20, 2022 at 8:39 AM Bruce D'Arcus  wrote:

> On Wed, Apr 20, 2022 at 8:28 AM Ihor Radchenko  wrote:
> >
> > There is an edge case triggering infinite loop in oc-basic.
> >
> > It is caused by bibtex-map-entries (used in
> > org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1].
> >
> > The proposed patch validates the bibtex buffer before processing and
> > throws an error if issues are found.  `bibtex-validate` also
> > conveniently displays a list of errors with clickable links to
> > problematic lines.
> >
> > I believe that it is useful for the users to see such issues instead of,
> > say, failing silently on malformed bibliographies.
> >
> > WDYT?
>
> I haven't tested it, but this is an excellent idea!
>
> Bruce
>
> > Best,
> > Ihor
> >
> > [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036
> >
> > * lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the
> > bibliography before parsing.  Display list of issues if any (via
> > `bibtex-validate`).
> > (org-cite-basic--parse-bibliography): Set buffer file name needed by
> > `bibtex-validate`.  Empty the cache in case of error.
> > ---
> >  lisp/oc-basic.el | 38 --
> >  1 file changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
> > index 873986d07..79f7a4844 100644
> > --- a/lisp/oc-basic.el
> > +++ b/lisp/oc-basic.el
> > @@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect)
> >(let ((entries (make-hash-table :test #'equal))
> >  (bibtex-sort-ignore-string-entries t))
> >  (bibtex-set-dialect dialect t)
> > +;; Throw an error if bibliography is malformed.
> > +(unless (bibtex-validate)
> > +  (user-error "Malformed bibliography at %S"
> > +  (or (buffer-file-name) (current-buffer
> >  (bibtex-map-entries
> >   (lambda (key &rest _)
> > ;; Normalize entries: field names are turned into symbols
> > @@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography
> (&optional info)
> >  (when (or (org-file-has-changed-p file)
> >(not (gethash file
> org-cite-basic--file-id-cache)))
> >(insert-file-contents file)
> > +  (setf (buffer-file-name) file)
> >(puthash file (org-buffer-hash)
> org-cite-basic--file-id-cache))
> > -   (let* ((file-id (cons file (gethash file
> org-cite-basic--file-id-cache)))
> > -   (entries
> > -(or (cdr (assoc file-id
> org-cite-basic--bibliography-cache))
> > -(let ((table
> > -   (pcase (file-name-extension file)
> > - ("json" (org-cite-basic--parse-json))
> > - ("bib" (org-cite-basic--parse-bibtex
> 'biblatex))
> > - ("bibtex"
> (org-cite-basic--parse-bibtex 'BibTeX))
> > - (ext
> > -  (user-error "Unknown bibliography
> extension: %S"
> > -  ext)
> > -  (push (cons file-id table)
> org-cite-basic--bibliography-cache)
> > -  table
> > -  (push (cons file entries) results)
> > +(unwind-protect
> > +(condition-case nil
> > +(unwind-protect
> > +   (let* ((file-id (cons file (gethash file
> org-cite-basic--file-id-cache)))
> > +   (entries
> > +(or (cdr (assoc file-id
> org-cite-basic--bibliography-cache))
> > +(let ((table
> > +   (pcase (file-name-extension
> file)
> > + ("json"
> (org-cite-basic--parse-json))
> > + ("bib"
> (org-cite-basic--parse-bibtex 'biblatex))
> > + ("bibtex"
> (org-cite-basic--parse-bibtex 'BibTeX))
> > + (ext
> > +  (user-error "Unknown
> bibliography extension: %S"
> > +  ext)
> > +  (push (cons file-id table)
> org-cite-basic--bibliography-cache)
> > +  table
> > +  (push (cons file entries) results))
> > +  (setf

Re: [PATCH] oc-basic: Detect malformed bibtex bibliographies

2022-04-20 Thread Bruce D'Arcus
On Wed, Apr 20, 2022 at 8:28 AM Ihor Radchenko  wrote:
>
> There is an edge case triggering infinite loop in oc-basic.
>
> It is caused by bibtex-map-entries (used in
> org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1].
>
> The proposed patch validates the bibtex buffer before processing and
> throws an error if issues are found.  `bibtex-validate` also
> conveniently displays a list of errors with clickable links to
> problematic lines.
>
> I believe that it is useful for the users to see such issues instead of,
> say, failing silently on malformed bibliographies.
>
> WDYT?

I haven't tested it, but this is an excellent idea!

Bruce

> Best,
> Ihor
>
> [1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036
>
> * lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the
> bibliography before parsing.  Display list of issues if any (via
> `bibtex-validate`).
> (org-cite-basic--parse-bibliography): Set buffer file name needed by
> `bibtex-validate`.  Empty the cache in case of error.
> ---
>  lisp/oc-basic.el | 38 --
>  1 file changed, 24 insertions(+), 14 deletions(-)
>
> diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
> index 873986d07..79f7a4844 100644
> --- a/lisp/oc-basic.el
> +++ b/lisp/oc-basic.el
> @@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect)
>(let ((entries (make-hash-table :test #'equal))
>  (bibtex-sort-ignore-string-entries t))
>  (bibtex-set-dialect dialect t)
> +;; Throw an error if bibliography is malformed.
> +(unless (bibtex-validate)
> +  (user-error "Malformed bibliography at %S"
> +  (or (buffer-file-name) (current-buffer
>  (bibtex-map-entries
>   (lambda (key &rest _)
> ;; Normalize entries: field names are turned into symbols
> @@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography (&optional 
> info)
>  (when (or (org-file-has-changed-p file)
>(not (gethash file org-cite-basic--file-id-cache)))
>(insert-file-contents file)
> +  (setf (buffer-file-name) file)
>(puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
> -   (let* ((file-id (cons file (gethash file 
> org-cite-basic--file-id-cache)))
> -   (entries
> -(or (cdr (assoc file-id 
> org-cite-basic--bibliography-cache))
> -(let ((table
> -   (pcase (file-name-extension file)
> - ("json" (org-cite-basic--parse-json))
> - ("bib" (org-cite-basic--parse-bibtex 
> 'biblatex))
> - ("bibtex" (org-cite-basic--parse-bibtex 
> 'BibTeX))
> - (ext
> -  (user-error "Unknown bibliography 
> extension: %S"
> -  ext)
> -  (push (cons file-id table) 
> org-cite-basic--bibliography-cache)
> -  table
> -  (push (cons file entries) results)
> +(unwind-protect
> +(condition-case nil
> +(unwind-protect
> +   (let* ((file-id (cons file (gethash file 
> org-cite-basic--file-id-cache)))
> +   (entries
> +(or (cdr (assoc file-id 
> org-cite-basic--bibliography-cache))
> +(let ((table
> +   (pcase (file-name-extension file)
> + ("json" 
> (org-cite-basic--parse-json))
> + ("bib" 
> (org-cite-basic--parse-bibtex 'biblatex))
> + ("bibtex" 
> (org-cite-basic--parse-bibtex 'BibTeX))
> + (ext
> +  (user-error "Unknown 
> bibliography extension: %S"
> +  ext)
> +  (push (cons file-id table) 
> org-cite-basic--bibliography-cache)
> +  table
> +  (push (cons file entries) results))
> +  (setf (buffer-file-name) nil))
> +  (error (setq org-cite-basic--file-id-cache nil)))
>(when info (plist-put info :cite-basic/bibliography results))
>results)))
>
> --
> 2.35.1
>
>
>
> --
> Ihor Radchenko,
> PhD,
> Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
> State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong 
> University, Xi'an, China
> Email: yanta...@gmail.com, ihor_radche...@alumni.sutd.edu.sg
>



[PATCH] oc-basic: Detect malformed bibtex bibliographies

2022-04-20 Thread Ihor Radchenko
There is an edge case triggering infinite loop in oc-basic.

It is caused by bibtex-map-entries (used in
org-cite-basic--parse-bibtex) when ran on a malformed bibtex buffer [1].

The proposed patch validates the bibtex buffer before processing and
throws an error if issues are found.  `bibtex-validate` also
conveniently displays a list of errors with clickable links to
problematic lines.

I believe that it is useful for the users to see such issues instead of,
say, failing silently on malformed bibliographies.

WDYT?

Best,
Ihor

[1] https://debbugs.gnu.org/cgi/bugreport.cgi?bug=55036

* lisp/oc-basic.el (org-cite-basic--parse-bibtex): Validate the
bibliography before parsing.  Display list of issues if any (via
`bibtex-validate`).
(org-cite-basic--parse-bibliography): Set buffer file name needed by
`bibtex-validate`.  Empty the cache in case of error.
---
 lisp/oc-basic.el | 38 --
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/lisp/oc-basic.el b/lisp/oc-basic.el
index 873986d07..79f7a4844 100644
--- a/lisp/oc-basic.el
+++ b/lisp/oc-basic.el
@@ -214,6 +214,10 @@ (defun org-cite-basic--parse-bibtex (dialect)
   (let ((entries (make-hash-table :test #'equal))
 (bibtex-sort-ignore-string-entries t))
 (bibtex-set-dialect dialect t)
+;; Throw an error if bibliography is malformed.
+(unless (bibtex-validate)
+  (user-error "Malformed bibliography at %S"
+  (or (buffer-file-name) (current-buffer
 (bibtex-map-entries
  (lambda (key &rest _)
;; Normalize entries: field names are turned into symbols
@@ -258,21 +262,27 @@ (defun org-cite-basic--parse-bibliography (&optional info)
 (when (or (org-file-has-changed-p file)
   (not (gethash file org-cite-basic--file-id-cache)))
   (insert-file-contents file)
+  (setf (buffer-file-name) file)
   (puthash file (org-buffer-hash) org-cite-basic--file-id-cache))
-   (let* ((file-id (cons file (gethash file 
org-cite-basic--file-id-cache)))
-   (entries
-(or (cdr (assoc file-id 
org-cite-basic--bibliography-cache))
-(let ((table
-   (pcase (file-name-extension file)
- ("json" (org-cite-basic--parse-json))
- ("bib" (org-cite-basic--parse-bibtex 
'biblatex))
- ("bibtex" (org-cite-basic--parse-bibtex 
'BibTeX))
- (ext
-  (user-error "Unknown bibliography extension: 
%S"
-  ext)
-  (push (cons file-id table) 
org-cite-basic--bibliography-cache)
-  table
-  (push (cons file entries) results)
+(unwind-protect
+(condition-case nil
+(unwind-protect
+   (let* ((file-id (cons file (gethash file 
org-cite-basic--file-id-cache)))
+   (entries
+(or (cdr (assoc file-id 
org-cite-basic--bibliography-cache))
+(let ((table
+   (pcase (file-name-extension file)
+ ("json" 
(org-cite-basic--parse-json))
+ ("bib" 
(org-cite-basic--parse-bibtex 'biblatex))
+ ("bibtex" 
(org-cite-basic--parse-bibtex 'BibTeX))
+ (ext
+  (user-error "Unknown 
bibliography extension: %S"
+  ext)
+  (push (cons file-id table) 
org-cite-basic--bibliography-cache)
+  table
+  (push (cons file entries) results))
+  (setf (buffer-file-name) nil))
+  (error (setq org-cite-basic--file-id-cache nil)))
   (when info (plist-put info :cite-basic/bibliography results))
   results)))
 
-- 
2.35.1



-- 
Ihor Radchenko,
PhD,
Center for Advancing Materials Performance from the Nanoscale (CAMP-nano)
State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong 
University, Xi'an, China
Email: yanta...@gmail.com, ihor_radche...@alumni.sutd.edu.sg