Re: [CODE4LIB] Reference string parsing software available: ParsCit v080402

2008-11-14 Thread MJ Suhonos
Hi all, John, the supplemented approach you describe is how we go about it in our Lemon8-XML (L8X) software (http://pkp.sfu.ca/lemon8). L8X handles parsing by passing the original unparsed string to a number of different parsers in turn (Freecite, each of the 3 Paracite
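A minimal sketch of the "pass the string to each parser in turn" idea described above. The two parser functions are hypothetical stand-ins, not the real Freecite or ParaCite interfaces. The message is cut off before it says what L8X does with the individual results, so this version only collects them per parser and makes no attempt to merge or rank them.

```python
import re
from typing import Callable, Dict, Optional

Citation = Dict[str, str]
Parser = Callable[[str], Optional[Citation]]


def year_parser(raw: str) -> Optional[Citation]:
    """Hypothetical parser: pulls out a four-digit year if one is present."""
    match = re.search(r"\b(?:1[89]|20)\d{2}\b", raw)
    return {"year": match.group(0)} if match else None


def title_parser(raw: str) -> Optional[Citation]:
    """Hypothetical parser: treats the first double-quoted span as the title."""
    parts = raw.split('"')
    return {"title": parts[1]} if len(parts) >= 3 else None


def run_parsers(raw: str, parsers: Dict[str, Parser]) -> Dict[str, Citation]:
    """Hand the same unparsed string to every parser and keep what each returns."""
    results: Dict[str, Citation] = {}
    for name, parser in parsers.items():
        parsed = parser(raw)
        if parsed:  # skip parsers that found nothing
            results[name] = parsed
    return results


# Hypothetical registry; a real setup would wrap external parsing services.
PARSERS: Dict[str, Parser] = {"year": year_parser, "title": title_parser}

reference = 'Doe, J. "An Example Title." Journal of Examples, 2008.'
parsed_by_parser = run_parsers(reference, PARSERS)
print(parsed_by_parser)
```

Keeping each parser's output under its own key leaves the choice of how to reconcile disagreements to a later step.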

Re: [CODE4LIB] Reference string parsing software available: ParsCit v080402

2008-11-17 Thread MJ Suhonos
complex topic that I'm hoping to make the subject of a submission to the Code4Lib journal. :-) MJ MJ Suhonos [EMAIL PROTECTED] 11/14/08 3:18 PM Hi all, John, the supplemented approach you describe is how we go about it in our Lemon8-XML (L8X) software (http://pkp.sfu.ca/lemon8). The way L8X

Re: [CODE4LIB] fedora-commons based article repository/publishing

2009-01-14 Thread MJ Suhonos
Hi Phil, I was just at a PKP workshop in Sydney in December, and the same developers from the Australian National University who developed the OJS METS export plugin unveiled a SWORD 1.2 deposit plugin that works with both Fedora and DSpace:

Re: [CODE4LIB] Assigning DOI for local content

2009-11-23 Thread MJ Suhonos
Hi all, couldn't resist jumping in on this one: But it appears that the handle system is quite a bit more fleshed out than a simple PURL server; it's a distributed protocol-independent network. The protocol-independent part may or may not be useful, but it certainly seems like it could be,

Re: [CODE4LIB] yaoss4ll

2009-12-21 Thread MJ Suhonos
I would definitely nominate the Qubit Toolkit and the PKP software suite as candidates for this list: http://qubit-toolkit.org/ http://pkp.sfu.ca/ Qubit is somewhat nascent, but is actively being developed and is fairly well-supported (by the ICA, UNESCO, LAC, among others), and the PKP suite

Re: [CODE4LIB] Online PHP course?

2010-01-06 Thread MJ Suhonos
I think that the single critical question to ask about any development in a digital library environment is its ability to deal with Unicode and its related standards such as UTF-8. Last time I looked at it, PHP had problems in that area. These problems will bedevil anything you write

Re: [CODE4LIB] Location of the first Code4Lib North meeting?

2010-01-20 Thread MJ Suhonos
in from elsewhere. MJ who also loves Kingston in the spring (but more in the summer when CORK is on) On 2010-01-20, at 10:32 AM, Walter Lewis wrote: On 20 Jan 10, at 10:16 AM, MJ Suhonos wrote: I think mode of transportation is something to consider; for those of us in South/Eastern Ontario

Re: [CODE4LIB] Kingston? And now the date (was Re: [CODE4LIB] Location of the first Code4Lib North meeting?)

2010-01-27 Thread MJ Suhonos
+1 Thursday-Friday 6-7 May here as well. MJ On 2010-01-27, at 10:50 PM, William Denton wrote: I went through all the mail about this and counted a + for each of the top two choices people made (if they made two; otherwise just one + for their single vote). The results: Kingston

Re: [CODE4LIB] Code4Lib 2011 Proposals

2010-03-02 Thread MJ Suhonos
Yes, a group of us at the University of British Columbia and Simon Fraser University in sushi-ski-beach-beer-MichaelBuble-soaked Vancouver, BC are intending to submit a proposal to host. More specifically, I wonder what thoughts people have about how a VanC4L2011 might affect / be

Re: [CODE4LIB] Code4Lib 2011 Proposals

2010-03-02 Thread MJ Suhonos
More specifically, I wonder what thoughts people have about how a VanC4L2011 might affect / be affected by the C4L North proposal, and Eric's comment that C4L was originally envisioned as an Access USA. There seems to be a strong contingent on both sides of the 49th parallel these days.

Re: [CODE4LIB] PHP bashing (was: newbie)

2010-03-25 Thread MJ Suhonos
Contemporary library web development: a Series of Hoses. http://en.wikipedia.org/wiki/Series_of_tubes MJ On 2010-03-25, at 11:00 AM, Joe Hourcle wrote: On Thu, 25 Mar 2010, Brian Stamper wrote: On Wed, 24 Mar 2010 17:51:38 -0400, Mark Tomko mark.to...@simmons.edu wrote: I wouldn't

Re: [CODE4LIB] PHP bashing (was: newbie)

2010-03-25 Thread MJ Suhonos
Also...it's pretty good for plugging leaks in ducts. Actually, true story: I was in the hardware store, poking around the tape section, with a roll of your typical silver duct tape in my hand, obviously browsing. An employee came up to me asking what I was looking for, and for what purpose.

Re: [CODE4LIB] Code4Lib North planning continues

2010-04-07 Thread MJ Suhonos
So far there are just three people with ideas for talks (me, Walter Lewis, Art Rhyno). I added mytpl.ca (alluringly entitled Location-aware Mobile Search). I figure it could be a good trailer for the forthcoming journal article. ;-) MJ

Re: [CODE4LIB] Twitter annotations and library software

2010-04-28 Thread MJ Suhonos
- there is a JavaScript CSL-Processor. JavaScript is kind of a punishment but it is the natural environment for the Web 2.0 Mashup crowd that is going to implement applications that use Twitter annotations A quick word of caution here; we got excited about citeproc-js until learning that it

[CODE4LIB] MODS and DCTERMS

2010-04-28 Thread MJ Suhonos
Hi all, I'm digging into earlier threads on Code4Lib and NGC4lib and trying to get some concrete examples around the DCTERMS element set — maybe I haven't been a subscriber for long enough. What I'm looking for in particular are things I can work with *in code/implementation*, most notably:

Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread MJ Suhonos
Okay, I know it's cool to hate on OpenURL, but I feel I have to clarify a few points: OpenURL is of no use if you separate it from the existing infrastructure which is mainly held by companies. No sane person will try to build an open alternative infrastructure because OpenURL is a crappy

Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread MJ Suhonos
It's not that it's cool to hate on OpenURL, but if you've really worked with it, it's easy to grow bitter. Well, fair enough. Perhaps what I'm defending isn't OpenURL per se, but rather the concept of being able to transport descriptive assertions the way the 1.0 spec proposes. The reason

Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread MJ Suhonos
Let me correct myself (for the detail-oriented among us): Actually the difference between OpenURL and DC is that one is a transport protocol and one is a metadata schema. :-) OpenURL is a *serialization format* which happens to be actionable by a transport protocol (HTTP), which is its main
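To illustrate the "serialization format carried over HTTP" point, here is a rough sketch that serializes a few descriptive fields as OpenURL 1.0 key/encoded-value pairs in a query string. The resolver address and the sample metadata are made up for illustration; only the `url_ver` and `rft_val_fmt` values and the `rft.` key prefix come from the Z39.88-2004 KEV conventions.

```python
from typing import Dict
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical resolver base URL, used only for this example.
RESOLVER = "http://resolver.example.org/openurl"


def build_openurl(base: str, metadata: Dict[str, str]) -> str:
    """Serialize referent metadata as KEV pairs appended to a resolver URL."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Referent fields are carried under the "rft." prefix.
    params.update({f"rft.{key}": value for key, value in metadata.items()})
    return f"{base}?{urlencode(params)}"


# Made-up article description.
url = build_openurl(
    RESOLVER,
    {"atitle": "An Example Article", "jtitle": "Journal of Examples", "date": "2010"},
)
print(url)

# The same string can be decoded again without any resolver involved.
decoded = parse_qs(urlsplit(url).query)
print(decoded["rft.atitle"])
```

The description round-trips through plain URL encoding, which is what lets it be handed to HTTP without being tied to any one resolver.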

Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)

2010-04-29 Thread MJ Suhonos
What I hope for is that OpenURL 1.0 eventually takes a place alongside SGML as a too-complex standard that directly paves the way for a universally adopted foundational technology like XML. What I fear is that it takes a place alongside MARC as an anachronistic standard that paralyzes an

Re: [CODE4LIB] MODS and DCTERMS

2010-05-03 Thread MJ Suhonos
dcterms is so terribly lossy that it would be a shame to reduce MARC to it. This is *precisely* the other half of my rationale -- a shame? Why? If MARC is the mind prison that some purport it to be, then let's see what a system built devoid of MARC, but based on the best alternative we have

Re: [CODE4LIB] MODS and DCTERMS

2010-05-03 Thread MJ Suhonos
NB: When Karen Coyle, Eric Morgan, and Roy Tennant all reply to your thread within half an hour of each other, you know you've hit the big time. Time to retire young I think. That would be Eric *Lease* Morgan — oh my god, you're right! I'm already losing data! It *is* insidious! I

Re: [CODE4LIB] MODS and DCTERMS

2010-05-04 Thread MJ Suhonos
I'd just like to say a word of thanks for everyone who has contributed so far on this thread. The viewpoints raised certainly help clarify at least my understanding of some of the issues and concepts involved. MARCXML is a step in the right direction. MODS goes even further. Neither really

Re: [CODE4LIB] MODS and DCTERMS

2010-05-04 Thread MJ Suhonos
Let me give another example: the Open Library API returns a JSON tree, e.g. http://openlibrary.org/books/OL1M.json But what schema is this? And if it doesn't conform to a standard schema, does that make it useless? If it were based on DCTERMS, at least I'd have a reference at

Re: [CODE4LIB] MODS and DCTERMS

2010-05-04 Thread MJ Suhonos
functionally requires semantics beyond those in the DCTERMS. All the better if some of those terms just happen to be available in Bibliontology or some other namespace... Thanks again, -Corey MJ Suhonos wrote: Let me give another example: the Open Library API returns a JSON tree, e.g. http

Re: [CODE4LIB] Indexing MARC(-JSON) with MongoDB?

2010-05-13 Thread MJ Suhonos
There's been some talk in code4lib about using MongoDB to store MARC records in some kind of JSON format. I'd like to know if you have experimented with indexing those documents in MongoDB. From my limited exposure to MongoDB, it seems difficult, unless MongoDB supports some kind of custom

Re: [CODE4LIB] Indexing MARC(-JSON) with MongoDB?

2010-05-13 Thread MJ Suhonos
Sorry, meant to include this link, which compares Elastic Search and Solr: http://blog.sematext.com/2010/05/03/elastic-search-distributed-lucene/ MJ

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread MJ Suhonos
It's helpful to think of MARCXML as a sort of lingua franca. - Existing libraries for reading, manipulating and searching XML-based documents are very mature. Including XSLT and XPath; very powerful stuff. There's nothing stopping you from reading the MARCXML into a binary blob and working
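As a small illustration of the XPath point, the sketch below reads one MARCXML record with Python's standard-library ElementTree and pulls out a subfield with a path expression. The record is invented for the example; the namespace is the MARC21 slim one. ElementTree supports only a subset of XPath, so a fuller XPath or XSLT workflow would need another library.

```python
import xml.etree.ElementTree as ET
from typing import List

# Invented sample record, kept inline so the example is self-contained.
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">Jane Doe.</subfield>
  </datafield>
</record>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfields(record: ET.Element, tag: str, code: str) -> List[str]:
    """Return the text of every matching subfield under the given datafield tag."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [element.text for element in record.findall(path, NS)]


record = ET.fromstring(MARCXML)
print(subfields(record, "245", "a"))
```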

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread MJ Suhonos
I'll just leave this here: http://www.indexdata.com/blog/2010/05/turbomarc-faster-xml-marc-records That trade-off ought to offend both camps, though I happen to think it's quite clever. MJ On 2010-10-25, at 3:22 PM, Eric Hellman wrote: I think you'd have a very hard time demonstrating any

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread MJ Suhonos
JSON++ I routinely re-index about 2.5M JSON records (originally from binary MARC), and it's several orders of magnitude faster than XML (measured in single-digit minutes rather than double-digit hours). I'm not sure if it's in the same range as binary MARC, but as Tim says, it's plenty fast

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-27 Thread MJ Suhonos
But it looks just like the old thing using insert data scheme and some templates? Ah yes, but now we're doing it in XML! I think this applies to 90% of instances where XML was adopted, especially within the enterprise IT industry. Through marketing or misunderstanding, XML was

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-28 Thread MJ Suhonos
be a non-alphanumeric attribute value in MARCXML? Is this a non-MARC21 thing? C On 10/25/10 3:35 PM, MJ Suhonos wrote: I'll just leave this here: http://www.indexdata.com/blog/2010/05/turbomarc-faster-xml-marc-records That trade-off ought to offend both camps, though I happen to think

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-28 Thread MJ Suhonos
The first comment claims a 30-40% increase in XML parsing, which seems obvious when you compare the number of characters in the example provided: 277 vs. 419, or about 34% fewer going through the parser. The speedup can be much greater than that -- from the blog post itself, Using
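The "about 34% fewer" figure can be checked directly from the two character counts quoted above; the function name here is just for illustration.

```python
def percent_fewer(smaller: int, larger: int) -> float:
    """Percentage by which `smaller` falls short of `larger`."""
    return (larger - smaller) / larger * 100


# Character counts quoted in the message: 277 vs. 419.
reduction = percent_fewer(277, 419)
print(f"{reduction:.1f}% fewer characters")
```

That is 142 fewer characters out of 419, which rounds to the 34% mentioned.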

Re: [CODE4LIB] PHP MVC frameworks

2010-11-15 Thread MJ Suhonos
Hi all, I've actually worked with the Public Knowledge Project for many years, so just to shed a little light on the PHP framework that we use: Alec Smecher, our lead architect, has gone on record several times as saying that the last thing the world needs is another PHP framework. It came

Re: [CODE4LIB] ElasticSearch

2013-03-14 Thread MJ Suhonos
Likewise, I've been using it since mid-2010 (0.6.0). What do you want to know about it? MJ

Re: [CODE4LIB] ElasticSearch

2013-03-14 Thread MJ Suhonos
To these responses, I would also add: extremely easy to install and configure -- that is, NO configuration is required to get it running out-of-the-box (including schema definitions, servlet containers, etc.) This alone was what drew me to ES in lieu of Solr way back, though I don't know if it