Could you create a wiki page afterwards?
Sure, I'll put up the wiki page as soon as I am done.

If you're doing a comparison, could you also include AIPs?
Not a problem. I've taken note of this. Are you interested in specific
metrics? I ask because my interest in these benchmarks is purely
response times relative to scale.

Lighton Phiri
http://lightonphiri.org

On 29/01/2013 13:26, helix84 wrote:

Hi Lighton,

I don't know of any such work done as of yet. I definitely would be
interested in your results/findings and surely others would be, too.
Could you create a wiki page afterwards?

On Tue, Jan 29, 2013 at 12:09 PM, Lighton Phiri <[email protected]> wrote:

1. I am guessing the Batch Metadata Editing [1] tool is appropriate in
this context as opposed to using Simple Archive Format [2]; Yes?

I think that's up to the implementation of BME. Theoretically, it
should be faster because it opens fewer files. I think determining
whether this is true will be one of your results.

I'm not sure if it is, but the CSV implementation should be streaming,
i.e. it should process the file line by line rather than keep the
whole CSV in memory at once. That will have the largest impact on
memory usage.
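
To illustrate what "streaming" means here, a minimal sketch in Python
(not DSpace code; BME itself is written in Java, and whether it works
this way is exactly the open question above). The function name and the
column names "id" and "dc.title" are assumptions made up for the example.
The reader yields one row at a time, so memory use stays roughly
constant regardless of how many lines the CSV has.

```python
import csv
import io


def count_filled_rows(csv_file, column):
    """Stream a CSV row by row and count rows with a non-empty `column`.

    Only the current row is held in memory; the file is never loaded whole.
    """
    reader = csv.DictReader(csv_file)
    count = 0
    for row in reader:  # one row at a time
        if row.get(column):
            count += 1
    return count


# Hypothetical sample data standing in for a metadata export.
sample = io.StringIO("id,dc.title\n1,First\n2,\n3,Third\n")
print(count_filled_rows(sample, "dc.title"))
```

The non-streaming alternative would be something like list(reader),
which materialises every row before processing starts.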

2. Are the experiments conducted to come up with the recommended line
limit of 1k specified here [2] documented anywhere, and are there
factors I should consider that could have a bearing on this limit? In
other words, is the 1k size a standard of some sort?

The limit is there only for web UIs, not the command line. The reason
is you don't want to upload a huge CSV file using a browser (and be
frustrated when it fails because you lose connection in the middle).
Also, the UIs display a diff of changes, giving you a chance to review
and confirm/cancel them. This could produce a huge HTML page, which is
clearly not desirable. The command line doesn't have any such
limitations apart from practical ones, like memory consumed.
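
If you do need to push a large CSV through a web UI, one option is to
split it into pieces that stay under the limit. A rough sketch in
Python, assuming the limit counts data rows and that every piece needs
to repeat the header line; the function name and sample columns are
made up for the example and this is not part of DSpace.

```python
import csv
import io
from itertools import islice


def split_rows(csv_file, limit=1000):
    """Yield lists of rows, each holding the header plus at most `limit` data rows."""
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    while True:
        batch = list(islice(reader, limit))  # read at most `limit` rows
        if not batch:
            break
        yield [header] + batch


# Hypothetical sample: five data rows split into pieces of two.
sample = io.StringIO("id,dc.title\n1,a\n2,b\n3,c\n4,d\n5,e\n")
for piece in split_rows(sample, limit=2):
    print(len(piece) - 1, "data rows")
```

Each piece can then be written out with csv.writer and uploaded
separately, so the diff page for any single upload stays small.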

3. Are there any other documented scalability performance benchmarks
other than this [3]?

Not that I know of. I'd just use Google, as you did.

I'd also appreciate any other comments/suggestions.

If you're doing a comparison, could you also include AIPs?


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
