This is an automated notification sent by LCG Savannah.
It relates to:
                task #13644, project CDS Invenio

==============================================================================
 LATEST MODIFICATIONS of task #13644:
==============================================================================

Update of task #13644 (project cdsware):

                  Status:                    None => Done                   
             Open/Closed:                    Open => Closed                 

    _______________________________________________________

Follow-up Comment #3:

Speed issues fixed as follows:

* in bibformat, the only addition is filter_hidden_fields, which is called
only once per record, and only if of=xm (MARCXML)
* caching records is done as before
* in the search engine, in search_pattern(), acc_authorize_action() is
called only once

http://cdsware.cern.ch/repo/?p=personal/cds-invenio-marko.git;a=shortlog;h=hiddentags

==============================================================================
 OVERVIEW of task #13644:
==============================================================================

URL:
  <http://savannah.cern.ch/task/?13644>

                 Summary: Do not show hidden notes in records (unless
authorized)
                 Project: CDS Invenio
            Submitted by: man
            Submitted on: 2010-02-01 10:55
         Should Start On: 2010-02-01 00:00
   Should be Finished on: 2010-03-01 00:00
                Category: WebAccess
                Priority: 5 - Normal
                  Status: Done
                 Privacy: Public
        Percent Complete: 100%
             Assigned to: man
             Open/Closed: Closed
         Discussion Lock: Any
                  Effort: 20.00

    _______________________________________________________


Records often contain "hidden" tags holding information that is not
meant for end users.

Example: tags 595 in the Atlantis collection records are technical CERN
notes.

Task: 

Define a new conf variable listing all hidden tags of an Invenio instance,
e.g. CFG_BIBFORMAT_HIDDEN_TAGS = 595

In the MARC and MARCXML output formats, especially when served to user-level
apps (e.g. via print_record() and format_record()), filter these tags out,
depending on user_info (authorization: runbibedit).
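
A minimal sketch of such filtering, assuming the record is available as a
MARCXML string.  The name filter_marcxml() and its hide_tags argument are
borrowed from the suggestion in the follow-up comments; visible_marcxml() and
its can_see_hidden flag are hypothetical stand-ins for the real authorization
check, not part of the Invenio API:

```python
import xml.etree.ElementTree as ET


def filter_marcxml(marcxml, hide_tags):
    """Return marcxml with every field whose tag is in hide_tags removed."""
    if not hide_tags:
        # Nothing to hide: hand back the stored MARCXML untouched (fast path).
        return marcxml
    hidden = set(hide_tags)
    root = ET.fromstring(marcxml)
    # Copy the element lists first, because children are removed below.
    for parent in list(root.iter()):
        for child in list(parent):
            local_name = child.tag.rsplit('}', 1)[-1]  # ignore any XML namespace
            if (local_name in ('datafield', 'controlfield')
                    and child.get('tag') in hidden):
                parent.remove(child)
    return ET.tostring(root, encoding='unicode')


def visible_marcxml(marcxml, hidden_tags, can_see_hidden):
    """Hypothetical wrapper: authorized users get the full record."""
    if can_see_hidden:
        return marcxml
    return filter_marcxml(marcxml, hidden_tags)


# Made-up sample record, for illustration only.
SAMPLE = (
    '<record>'
    '<controlfield tag="001">1</controlfield>'
    '<datafield tag="245" ind1=" " ind2=" ">'
    '<subfield code="a">Title</subfield></datafield>'
    '<datafield tag="595" ind1=" " ind2=" ">'
    '<subfield code="a">internal note</subfield></datafield>'
    '</record>'
)
```

With an empty hide_tags list the input is returned as is, so sites that define
no hidden tags would pay no parsing cost.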

In the search engine, in the search_pattern() function, before
calling search_unit(), check whether the user has the rights to search
this unit (e.g. `runbibedit' rights); otherwise pretend that
search_unit() returned an empty hitset.
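
The check could look roughly like the sketch below.  It is not the actual
search_pattern() code: search_units(), is_hidden_field() and the is_authorized
callable are hypothetical names, plain Python sets stand in for hitsets, and
wildcard tags such as 59% are not handled:

```python
def is_hidden_field(field, hidden_tags):
    """True if a search field such as '595__a' refers to a hidden MARC tag."""
    # Cheap pre-check: only MARC-tag searches start with two digits.
    if len(field) < 2 or not field[:2].isdigit():
        return False
    return field[:3] in hidden_tags


def search_units(units, search_unit, is_authorized, hidden_tags):
    """Run search_unit() on each (pattern, field) pair in units.

    Units that target a hidden tag yield an empty hitset unless
    is_authorized() returns True.
    """
    allowed = None  # authorization result, computed lazily and at most once
    hitsets = []
    for pattern, field in units:
        if is_hidden_field(field, hidden_tags):
            if allowed is None:
                allowed = bool(is_authorized())
            if not allowed:
                # Pretend that search_unit() found nothing.
                hitsets.append(set())
                continue
        hitsets.append(search_unit(pattern, field))
    return hitsets
```

Queries that never touch a MARC tag skip the authorization call entirely, and
queries that do touch one trigger it only once.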


    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: 2010-02-04 13:11              By: Marko Niinimaki <man>
Speed issues fixed as follows:

* in bibformat, the only addition is filter_hidden_fields, which is called
only once per record, and only if of=xm (MARCXML)
* caching records is done as before
* in the search engine, in search_pattern(), acc_authorize_action() is
called only once

http://cdsware.cern.ch/repo/?p=personal/cds-invenio-marko.git;a=shortlog;h=hiddentags

-------------------------------------------------------
Date: 2010-02-03 15:59              By: Tibor Simko <simko>
I had a quick look at the patch diffs, without testing how it works yet.
Here are some issues we may need to fix:

- The MARC generation is always done on the fly, even if we could take
  it from the DB.  This is very inefficient, because it can be ~70
  times slower to construct MARCXML from bibxxx tables:

    %timeit format_record(1, 'xm')
    1000 loops, best of 3: 412 us per loop

    %timeit format_record(1, 'xm', on_the_fly=True)
    10 loops, best of 3: 36.6 ms per loop

  This is also being done for sites where CFG_BIBFORMAT_HIDDEN_TAGS is
  empty, so it can be quite a performance penalty.

  (Moreover, we may stop using bibxxx tables one day, and use MARCXML
  only, see some old musings.)

  So, it would be preferable to fetch full MARCXML from the DB, and to
  filter the hidden fields afterwards, if the user cannot see them:  a
  kind of XSLT-style post-processing of the full MARCXML in order to
  remove hidden fields.  In this way the internal accesses to MARCXML
  (e.g. BibEdit) would be ultra fast, as before.  And your wrapper
  would be nicely separated from the previous code, so that we could
  possibly do things like:

    filter_marcxml(format_record(1, 'xm'), hide_tags=['595', '933'])
    filter_marc(format_record(1, 'hm'), hide_tags=['333'])

- Another performance consideration: inside the BibFormatObject
  constructor, acc_authorize_action() is always called, but this is
  only necessary for the MARC and MARCXML output formats.  So in the
  vast majority of cases (e.g. an ordinary user displaying 25 hits per
  page in HTML brief format) we would call this thing unnecessarily.
  Can you please move the hidden fields checking down to the MARC
  output format branch, or at least initialize
  self.can_see_hidden_fields variable only when self.format is of the
  MARC type?

- Another performance consideration: in the search engine, in
  search_pattern(), acc_authorize_action() is called many times, even
  if it is not needed.  It would be better to call it only once, and
  even better only if some bsu_f starts with two digits (which is
  rarely the case).  Moreover, we should probably activate the check
  for hidden tag searching only in cases when users come via the Web
  interface (detected by the req context), so that CLI searches would
  always be fast.

- BTW the patch removes the line ``p_tag = parse_tag(tag)'' from
  bibformat_engine.py, leading to an undefined p_tag variable.
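
The second point above (doing the authorization check only for the MARC
output formats) could be sketched with a lazily evaluated attribute.  This is
an illustration under assumptions, not the BibFormatObject code: the class
name FormatContext, the is_authorized callable and the filter_func argument
are hypothetical, and 'hb' is used only as an example of a non-MARC format:

```python
class FormatContext(object):
    """Hypothetical stand-in for the object that formats one record."""

    def __init__(self, output_format, is_authorized):
        self.format = output_format
        self._is_authorized = is_authorized   # callable, may be expensive
        self._can_see_hidden_fields = None    # not computed yet

    @property
    def can_see_hidden_fields(self):
        # Computed on first access only, then cached for this object.
        if self._can_see_hidden_fields is None:
            self._can_see_hidden_fields = bool(self._is_authorized())
        return self._can_see_hidden_fields

    def render(self, full_marcxml, filter_func):
        """Return the output for this format.

        filter_func(marcxml) must return the record without hidden fields.
        """
        if self.format == 'xm':
            # Only the MARCXML branch ever touches the authorization check.
            if self.can_see_hidden_fields:
                return full_marcxml
            return filter_func(full_marcxml)
        # Other formats never expose raw MARC, so no check is needed here.
        return '(output in format %s)' % self.format
```

Constructing the object costs nothing extra, so the common HTML brief case
never calls the authorization function.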


-------------------------------------------------------
Date: 2010-02-03 13:25              By: Marko Niinimaki <man>
Functionality implemented in

http://cdsware.cern.ch/repo/?p=personal/cds-invenio-marko.git;a=shortlog;h=hiddentags






    _______________________________________________________

Carbon-Copy List:

CC Address                          | Comment
------------------------------------+-----------------------------
1576                                | -COM-
3346                                | -SUB-




==============================================================================

This item URL is:
  <http://savannah.cern.ch/task/?13644>

_______________________________________________
  Message sent via/by LCG Savannah
  http://savannah.cern.ch/
