[CODE4LIB] R?

2009-09-10 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
 William == William Denton w...@pobox.com writes:
William Are any of you using R?  http://www.r-project.org/

I use R for a number of things, including the multidimensional
scaling (512-->2) I do here:
 http://zzzoot.blogspot.com/2009/07/project-torngat-building-large-scale.html 

It is fast, backed by the stats brainiacs, and has a huge number of
domain-specific modules (biology, genomics, geology, engineering,
...).

It is great. Slices bread, juliennes fries, casts my votes, does my
taxes, feeds my dogs and submits my postings to code4lib.  ;-)

-glen



 William == William Denton w...@pobox.com writes:

William Are any of you using R?  http://www.r-project.org/

William Blog about R, info viz, etc.:
William http://blog.revolution-computing.com/

William I have something in mind I'm going to try fooling around
William with in R, but I wondered if anyone was using it for
William visualizing searches, usage, networks of information,
William that kind of thing.

William Bill -- William Denton, Toronto : miskatonic.org
William www.frbr.org openfrbr.org

-- 

Glen Newton | glen.new...@nrc-cnrc.gc.ca
Researcher, Information Science, CISTI Research
 NRC W3C Advisory Committee Representative
http://tinyurl.com/yvchmu
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
http://www.nrc-cnrc.gc.ca/
Institut canadien de l'information scientifique et technique (ICIST) 
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6  
Government of Canada | Gouvernement du Canada   
--


Re: [CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy

2009-08-05 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
I should hope that Google is smart enough to look at the HTTP Via
header[1] and allow bigger caps for proxied HTTP requests.

On the other hand:
 1) Google decides to have differential caps for proxied requests
 2) People figure out that they can grab more by pretending to be a
proxy, inserting this header field into their HTTP requests
 3) Google catches on and goes back to one cap to bind them all...

BTW, if #1 is true and #2 and #3 are not yet true, then they soon will be!  ;-)
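
As a rough illustration of what a proxy does with this header, here is a
minimal sketch. It only builds the header value described in RFC 2616
section 14.45; the class name ViaHeader, the method append and the
example.org host names are made up for this example, and nothing here is
specific to Google.

```java
// Hypothetical helper, not part of any library: builds the value of the
// HTTP Via header the way a forwarding proxy would (RFC 2616 sec. 14.45).
final class ViaHeader {
    // existing: the Via value already on the request, or null if there is none
    // protocolVersion: e.g. "1.1"; receivedBy: the proxy's host name or pseudonym
    static String append(String existing, String protocolVersion, String receivedBy) {
        String entry = protocolVersion + " " + receivedBy;
        if (existing == null || existing.trim().isEmpty()) {
            return entry;
        }
        // Each proxy adds its own entry to the end of the list, so the
        // header records the chain of intermediaries in order.
        return existing.trim() + ", " + entry;
    }

    public static void main(String[] args) {
        String via = append(null, "1.1", "library-proxy.example.org");
        via = append(via, "1.1", "upstream-cache.example.org");
        System.out.println("Via: " + via);
    }
}
```

Since the value is just text supplied by the sender, anyone can fake it,
which is the point of #2 above.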

Glen Newton
http://zzzoot.blogspot.com/

[1]http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.45


 I suspect that proxying Google will trigger an automatic throttle.
 Early on, a number of us hit GB hard, trying to figure out what they
 had, and got stopped.
 
 Tim
 
 On Wed, Aug 5, 2009 at 9:59 AM, Eric Hellman e...@hellman.net wrote:
  Recent attention to privacy concerns about Google Book Search has led me to
  investigate whether any libraries are using tools such as proxy servers to
  enhance patron privacy when using Google Book Search. Similarly, advertising
  networks (web bugs, for example) could be proxied for the same reason. I
  would be very interested to hear from any libraries that have done either of
  these things and of their experiences doing so.
 
 
  Eric Hellman
  President, Gluejar, Inc.
  41 Watchung Plaza, #132
  Montclair, NJ 07042
  USA
 
  e...@hellman.net
  http://go-to-hellman.blogspot.com/
 
 
 
 
 -- 
 Check out my library at http://www.librarything.com/profile/timspalding


Re: [CODE4LIB] proxying Google Book Search and advertising networks to protect patron privacy

2009-08-05 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
It may/should help protect the user's privacy from the server end (from
Google), not the client end. 

In the original question there is an underlying (perhaps true ;-)  )
assumption that librarians are more trustworthy than Google. 

-glen :-)


 Nate == Nate Vack njv...@wisc.edu writes:

Nate Are you talking about proxying connections from library
Nate computers?  For computers in the library, how does a proxy
Nate help with privacy? How is a person linked to a web request?
Nate Login records? Video footage? If either of those, wouldn't
Nate purging the records be the simplest way to provide privacy?

Nate Sorry if I'm being dumb, but I don't understand how proxying
Nate helps privacy in this context.

Nate Cheers, -Nate

Nate On Wed, Aug 5, 2009 at 8:59 AM, Eric
Nate Hellman e...@hellman.net wrote:
 Recent attention to privacy concerns about Google Book Search
 has led me to investigate whether any libraries are using
 tools such as proxy servers to enhance patron privacy when
 using Google Book Search. Similarly, advertising networks (web
 bugs, for example) could be proxied for the same reason. I
 would be very interested to hear from any libraries that have
 done either of these things and of their experiences doing so.
 
 
 Eric Hellman President, Gluejar, Inc.  41 Watchung Plaza, #132
 Montclair, NJ 07042 USA
 
 e...@hellman.net http://go-to-hellman.blogspot.com/
 


[CODE4LIB] Semantic Maps of Science from Full-text

2009-07-29 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
I thought this community would find this interesting: using large
scale journal full-text to create a visualization of the journal
space (like 'Maps of Science' thingies). 5.7M full-text STM articles
from 2200+ journals. 

We're pretty excited about this, but I won't rant on about it any
more (here!): for more info: 
 http://zzzoot.blogspot.com/2009/07/project-torngat-building-large-scale.html

Let me know if you have any questions...

Glen



[CODE4LIB] code4lib open source software award

2009-03-06 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
I also think this is a good idea. I'd like to comment on the straw
man:

   * Regarding who is eligible, I suggest it be
 individuals, teams, or corporate entities.
 Awardees must be willing to serve on the
 next year's nominating committee.

"Awardees" should be changed to "nominees": if you are nominated and
not willing to serve on next year's committee, then you must resign
from the process before the award is given out.

I am not sure about corporate entities: I do think the teams from a
corporate entity should be eligible, but the organization as-a-whole
should not get the award. Instead we want to recognize those
individuals in the corporate entity who actually built (and likely
championed) the software. I think this has more importance and is more
consistent with the values of the community.  

   * Regarding what is eligible, I suggest the
 software be open source, directly
 library-related, and developed within the
 past two years.

1 - Truly Open Source: only using a license recognised by OSI[1] is
acceptable. Let's be explicit to avoid confusion.
2 - I would suggest "first released in the last 3 years". This
supports new activities, and gives them more chance to get
traction in the community. Sometimes things immediately take off;
other times they take time to make it.
3 - "Directly library-related" is problematic. It could rule out some
significant contributions. I would instead say something like
"directly impacting libraries".

   * Regarding the timing, I suggest this be an
 annual award given at each Code4Lib
 conference.

Sounds good.

-Glen Newton

[1]http://www.opensource.org/




From: Eric Lease Morgan emor...@nd.edu
Sender:   Code for Libraries CODE4LIB@LISTSERV.ND.EDU
To:   CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] code4lib open source software award
Date: Thu, 5 Mar 2009 19:52:42 -0500

As a community, let's establish the Code4Lib Open Source Software Award.
Lots of good work gets produced by the Code4Lib community, and I believe it
is time to acknowledge these efforts in some tangible manner. Our profession
is full of awards for leadership, particular aspects of librarianship,
scholarship, etc. Why not an award for the creation of software? After all,
the use of computers and computer software is an essential part of our
day-to-day work. Let's grant an award for something we value -- good,
quality, open source software.

While I think the idea of an award is a laudable one, I have more questions
than answers about the process of implementing it. Is such a thing
sustainable, and if so, then how? Who is eligible for the award? Only
individuals? Teams? Corporate entities? How are awardees selected?
Nomination? Vote? A combination of the two? What qualities should the
software exemplify? Something that solves a problem for many people?
Something with a high cool factor? Great documentation? Easy to install?
Well-supported with a large user base? Developed within the past year?

As a straw man for discussion, I suggest something like the following:

  * Regarding selection, I suggest there be a
committee who solicits nominations and
selects the awardee(s). As the years go by
an individual from the committee drops off
and the/an awardee becomes a member.

  * Regarding who is eligible, I suggest it be
individuals, teams, or corporate entities.
Awardees must be willing to serve on the
next year's nominating committee.

  * Regarding what is eligible, I suggest the
software be open source, directly
library-related, and developed within the
past two years.

  * Regarding the timing, I suggest this be an
annual award given at each Code4Lib
conference.

These are just suggestions to get us started. What do you think? Consider
sharing your thoughts as comments below, in channel, or on the Code4Lib
mailing list.

--
Eric Lease Morgan
University of Notre Dame


Re: [CODE4LIB] code4lib open source software award

2009-03-06 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
Jonathan Rochkind wrote:
It might be a good idea, but maybe not with the Code4Lib name. But I worry
in general we don't collectively know enough about what makes good software
to give a Software of the Year honor reliably.

Karen Schneider wrote:
On the one hand, I agree. On the other hand, just to note, there was a
breakout session at C4L where quality issues of OSS were discussed;

I respectfully disagree: I don't see this as a software quality award.

I would suggest some criteria for judging software for this
award. Software that is:
 1 being used by a significant portion of the community
 2 filling a significant need
 3 having a positive impact on the community
 4 supportive of others participating/collaborating
 5 responsive to the community
 6 (to a lesser extent) innovative
 7 your criterion here

I also have no problem with an award with code4lib on it. It is just
saying that the awardee software has been found to be [see
1,2,3,4,5,6,7... above] by the code4lib award committee.

-glen


[CODE4LIB] Announcement: LuSql: Database to Lucene indexing

2008-11-17 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
I am proud to announce LuSql:

LuSql is a simple but powerful tool for building Lucene indexes from 
relational databases. It is a command-line Java application for the 
construction of a Lucene index from an arbitrary SQL query of a
JDBC-accessible SQL database. It allows a user to control a number of
parameters, including the SQL query to use, individual
indexing/storage/term-vector nature of fields, analyzer, stop word
list, and other tuning parameters. In its default mode it uses
threading to take advantage of multiple cores.

LuSql can handle complex queries, allows for additional per record
sub-queries, and has a plug-in architecture for arbitrary Lucene
document manipulation. Its only dependencies are three Apache Commons
libraries, the Lucene core itself, and a JDBC driver.

LuSql has been extensively tested, including a large 6+ million
full-text & article metadata document collection, producing an 86GB
Lucene index. 

http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

If you have any questions, please contact me.

Thanks,

Glen Newton :-)



[CODE4LIB] amazon s3?

2008-11-10 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
While not a personal experience, I think that the NY Times effort to
convert 4TB of TIFFS to 11 million PDFs using Hadoop + EC2 + S3 might
be of interest:
 http://zzzoot.blogspot.com/2008/02/hadoop-ec2-s3-super-alternatives-for.html

Glen

 Hi Folks,
 Anybody doing mass storage for their library/consortium on amazon s3?
 
 Anybody rejected it as an idea?
 
 Willing to share?  Please do.
 
 Tim
 
 +++
 Tim Shearer
 
 Web Development Coordinator
 The University Library
 University of North Carolina at Chapel Hill
 [EMAIL PROTECTED]
 919-962-1288
 +++



[CODE4LIB] Tag Cloud inspired HTML Select lists

2008-10-16 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
This is something I did a little while ago, but thought some on this
list might find it interesting:

http://zzzoot.blogspot.com/2007/10/tag-cloud-inspired-html-select-lists.html

Glen 



[CODE4LIB] List of Digital Library Conference Proceedings

2008-07-18 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
This may be tangential to this list...

I've just posted a list of major digital library conference proceedings: 
 http://zzzoot.blogspot.com/2008/07/list-of-digital-library-conference.html

If there are any that are missing, please let me know & I will add
them.

Thanks,

Glen



Re: [CODE4LIB] [c4lj-articles] SPARC Europe and the DOAJ Announce the Launch of the SPARC Europe Seal for Open Access Journals

2008-04-25 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
I touch on the text mining etc. needs of researchers in two recent blog entries:

- FREE THE ARTICLES! (Full-text for researchers & scientists and their
  machines) 
http://zzzoot.blogspot.com/2008/04/free-articles-full-text-for-researchers.html
- New Open Access Criterion: Support access by machines (m2m)
  http://zzzoot.blogspot.com/2008/04/new-open-access-criterion-support.html

-Glen

 Jonathan == Jonathan Rochkind [EMAIL PROTECTED] writes:

Jonathan An announcement from the DOAJ that we got at the
Jonathan Code4Lib Journal, since we're listed in the DOAJ.

Jonathan I forward it to you all because it's related to the
Jonathan on-going discussion some of us are having about how the
Jonathan heck can we get our software to find open access
Jonathan versions of articles. Certainly not close to a fix-all
Jonathan even if their project is successful, but addresses one
Jonathan component of one subset of open access material.

Jonathan Jonathan

Jonathan doaj-team wrote:
 Lund Sweden 23 April 2008 Important news for all publishers who
 have journals listed in the Directory of Open Access Journals
 (DOAJ)

 Dear publishers of journals listed in the Directory of Open
 Access Journals (DOAJ)

 We --the team behind the DOAJ-- are approaching you to inform
 about two important issues.

 Firstly, as you probably are aware of, there is a growing
 discussion and attention to open access to scholarly
 information in the research community. The current discussion
 is concentrating on open access in a broader sense than just
 free access to journal articles.

 In order for research to be really open, researchers need more
 than just to get free access to the articles -- that is more
 than free-to-read. Researchers are increasingly demanding and
 expecting to be able to reuse not only the text in various
 ways, but increasingly to be able to do text- and data mining
 in order to more efficiently extract and discover fractions of
 the content (i.e. for instance acronyms for genes, proteins,
 abbreviations etc.) and to uncover hidden relations between
 such fractions by automated computing.

 In order for open access journals to be even more useful and
 thus receive more exposure and provide more value to the
 research community it is very important that open access
 journals offer standardized, easily retrievable information
 about what kinds of reuse are allowed.

 Creative Commons offers a number of licenses that, in a
 standardized way, make it very easy for content providers to
 offer information about these issues. More information about
 this under Step 1 below.

 Secondly, SPARC Europe and The Directory of Open Access
 Journals (operated by Lund University, Sweden) have entered an
 agreement about introducing a certification scheme for Open
 Access journals, the SPARC Europe Seal for Open Access
 Journals.

 The intention of the scheme is to motivate open access journals
 to deliver metadata to DOAJ. The DOAJ team will then convert
 the metadata into standardized XML-format and OAI-compliant
 format, which will further increase the visibility of articles
 and provide means for the easiest possible dissemination thus
 reaching more readers, attracting more authors, gaining more
 prestige and impact.

 The team behind the DOAJ will offer various forms of assistance
 and guidance in this respect.

 What are the advantages of having the SPARC Europe Seal?

 Improved information as to what users are allowed to do with
 papers published in your journal(s).

 Possible long-term archiving of your content, which makes
 publishing in your journal more attractive to authors.

 Better exposure as a high-quality journal based on state-of-the
 art dissemination technologies.

 The DOAJ team converts your metadata and makes the metadata
 harvestable, which means the widest possible dissemination and
 thus increased usage and impact.


 How to be approved:

 Step 1:

 Choose the Creative Commons CC-BY license.



 In order to qualify for the SPARC Europe Seal you must apply
 the CC-BY license, which is 

Re: [CODE4LIB] KR

2008-04-04 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
The signal-to-noise ratio is dropping on this list. Perhaps this
extremely humorous discussion could be taken off-list?

constructively,
Glen

 Mark == Mark Sandford [EMAIL PROTECTED] writes:

Mark 01010111 01101000 01100101 01101110 0010 0001
Mark 0110 01110101 0010 01110011 01110100 0111
Mark 01110010 01110100 0010 01110100 0110 0010
Mark 01100100 01110010 01100101 0111 01101101 0010
Mark 01101001 01101110 0010 01100010 01101001 01101110
Mark 0111 01110010 0001 00101100 0010 0001
Mark 0110 01110101 0010 01101011 01101110 0110
Mark 01110111 0010 0001 0110 01110101 0010
Mark 01101000 0111 01110110 01100101 0010 0111
Mark 01110010 0110 01100010 01101100 01100101 01101101
Mark 01110011 00101110

Mark 01001101 0111 01110010 01101011

Mark On Thu, Apr 3, 2008 at 2:53 PM, Ryan Ordway
Mark [EMAIL PROTECTED] wrote:
 #include stdio.h main(t,_,a) char *a; {
 return!0t?t3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
 1,t_?main(t+1,_,a):3,main(-94,-27+t,a)t==2?_13?
 main(2,_+1,%s %d %d\n):9:16:t0?t-72?main(_,t,
 @n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n
 +,/#\ ;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
 q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw'
 i;# \ ){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw'
 iK{;[{nl]'/w#q#n'wk nw' \ iwk{KK{nl]!/w{%'l##w#' i;
 :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
 ;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}
 {rl#'{n' ')# \ }'+}##(!!/)
 :t-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
 :0t?main(2,2,%s):*a=='/'||main(0,main(-61,*a, !ek;dc
 [EMAIL PROTECTED]'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m
 .vpbks,fxntdCeghiry),a+1); }



 On Apr 3, 2008, at 8:54 AM, Jeremy Frumkin wrote:

  ..- .-.. .-..  .. .. -- --. --- .. -. --.  - --- ... .- 
 -.-- .-  -... --- ..- - -  .. ...  -  .-. . .- -..
 .. ...   -  .- -  -. --- -. .  --- ..-.  -.--  --- ..-
 ... ..- ..-. ..-. . .-.  ..-. .-.   --- -- .-. -- ..  -  .
 .-- .- -.-- ..  -..   --- .--  . -.   ..  ..- ... .  --
 -.-- .--. .-. . ..-. . .-. .-. . -..  ..
 
 
 
  -. .--. ..-  - -.. . ...- .. -.-. . .-.-.- .-.-.- .-.-.-
 
  -- -- .--- .- ..-.
 
 
  On 4/3/08 6:51 AM, Walter Lewis [EMAIL PROTECTED] wrote:
 
 
   Sebastian Hammer wrote:
  
   
 A true hacker has no need for these crude tools. He
 waits for cosmic radiation to pummel the
 magnetic patterns on his drive into a pleasing
 and functional sequence of bits.

   
   Alas, having been doing this (along with my partners, the
 four   Yorkshiremen) since the Stone Age ...
  
   We used to arrange pebbles in the middle of road into the
 relevant   patterns (we *dreamed* of being able to afford the
 wire for an   abacus).Passing carts would then help
 crunch the numbers.
  
   Walter   for whom graph paper, templates, pencils, 80
 column punchcards and   IBM Assembler were formative
 experiences
  
  
 
 
 
  ===  Jeremy
 Frumkin  Head, Emerging Technologies and Services  121 The
 Valley Library, Oregon State University  Corvallis OR
 97331-4501
 
  [EMAIL PROTECTED]
 
  541.602.4905  541.737.3453 (Fax) 
 ===   Without
 ambition one starts nothing. Without work one finishes 
 nothing.   - Emerson
 
 


 --
 Ryan Ordway E-mail: [EMAIL PROTECTED] Unix Systems
 Administrator [EMAIL PROTECTED] OSU Libraries,
 Corvallis, OR 97331 Office: Valley Library #4657




--
Mark Sandford
Mark Special Formats Cataloger William Paterson University
Mark (973)270-2437 [EMAIL PROTECTED]


[CODE4LIB] many processes, one result

2008-02-18 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
 How do I write a computer program that spawns many processes but
 returns one result?

 I suppose the classic example of my query is the federated search. Get
 user input. Send it to many remote indexes. Wait. Combine results.
 Return. In this scenario when one of the remote indexes is slow things
 grind to a halt.

 I have a more modern example. Suppose I want to take advantage of many
 Web Services. One might be spell checker. Another might be a
 thesaurus. Another might be an index. Another might be a user lookup
 function. Given this environment, where each Web Service will return
 different sets of streams, how do I query each of them simultaneously
 and then aggregate the result? I don't want to do this sequentially. I
 want to fork them all at once and wait for their return before a
 specific time out. In Perl I can use the system command to fork a
 process, but I must wait for it to return. There is another Perl
 command allowing me to fork a process and keep going but I don't
 remember what it is. Neither one of these solutions seems feasible. Is
 the idea of threading in Java supposed to be able to address this
 problem?

Yes. I do this kind of thing all the time (and it takes advantage of
multi-CPU and multi-core machines). Java threading is more
lightweight than forking.

---
MyThread[] myThreads = new MyThread[20];

// start all threads
for (int i = 0; i < 20; i++)
{
    MyThread m = new MyThread();
    m.start();
    myThreads[i] = m;
}

for (int i = 0; i < 20; i++)
{
    // wait for each to complete. Note that a thread may be
    // completed before this method is called.
    // join() can throw InterruptedException, so the enclosing
    // method must catch or declare it.
    myThreads[i].join();
}

Note that there is a join(long timeoutMillis) method.
Note that the threads can be doing all sorts of different things (like
the situation you describe).
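
If you want a single deadline for the whole batch rather than a timeout
per join(), java.util.concurrent can do the bookkeeping. A minimal
sketch, assuming each remote lookup can be wrapped in a Callable; the
class name FanOut, the method queryAll and the fake services in main are
invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: run many lookups at once, return one combined result.
final class FanOut {
    // Runs every task in parallel and returns the results of those that
    // finished within the timeout; slow or failed tasks are skipped.
    static List<String> queryAll(List<Callable<String>> tasks, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            // invokeAll blocks until all tasks finish or the timeout expires;
            // tasks still running at that point are cancelled.
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                if (f.isCancelled()) {
                    continue; // did not finish in time
                }
                try {
                    results.add(f.get());
                } catch (ExecutionException e) {
                    // the task itself threw an exception; leave it out
                }
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "spell checker: done");
        tasks.add(() -> "thesaurus: done");
        tasks.add(() -> { Thread.sleep(5000); return "slow index: done"; });
        // The slow index misses the 500 ms deadline and is dropped.
        System.out.println(queryAll(tasks, 500));
    }
}
```

The results come back in the order the tasks were submitted, so the
caller can still tell which service produced which answer.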

-Glen



Re: [CODE4LIB] administratativia

2008-02-10 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
Ditto.  :-)

Glen Newton

Kevin Clarke wrote:
 Welcome John,
 It's nice to have more Java folks around :-)

 Kevin


 On Feb 9, 2008 11:13 AM, John Fereira [EMAIL PROTECTED] wrote:
  Roy's message to the web4lib list gave me a nudge that I should probably
  subscribe to Code4Lib.  Then I had a meeting yesterday in which I was asked
  to start working on a new project that would require that I do more library
  specific development (I've mostly been developing applications for the
  international agriculture community for the past few years).  The kicker
  though was reading that Eric has decided to come over to the dark side and
  has started to do some development in Java and I figured he could use some
  moral support.
 
  As a bit of an introduction...
 
  I work at Mann Library, one of the 20 unit libraries at Cornell University.
 Albert R. Mann Library is the library for the College of Agriculture and
  Life Sciences, thus my work in the Agriculture community.  My official title
  is Programmer/Analyst Specialist, however for the past couple of years I
  have served as the Technology Strategist/Systems Architect for our
  department as well.  I develop web and standalone applications almost
  exclusively in Java primarily using the Spring Framework and SOA practices.
 
  I have been an active participant in the open source community primarily
  with my association with JA-SIG for about six years.  I have been serving on
  the planning committee for the upcoming JA-SIG conference (I'll have to see
  if an announcement for it has been posted yet) and have been on the
  committee for the past four conferences.  I've served as the jasig.org (and
  uPortal.org) webmaster for several years.
 
  Prior to working at Cornell I worked in the corporate world going back 30
  years, including 13 years at the Hewlett Packard workstation division (where
  I set up their first TCP/IP network), but my first job in the electronics
  industry was working on a production line building home Pong games at Atari.
 
  I have been an active participant in the Usenet community since 1985.
 



 --
 There are two kinds of people in the world: those who believe there
 are two kinds of people and those who know better.



[CODE4LIB] Whatbird Interface Framework

2007-12-18 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
Hi Michael,

Taxonomic dichotomous (or binary) keys
(http://en.wikipedia.org/wiki/Dichotomous_key) and synoptic keys
(http://pyrenomycetes.free.fr/hypoxylon/keydir/synoptickey.htm) have a
number of implementations on the web and there is a significant body
of research and software out there. I did some graduate work in this
area (in my previous incarnation I was a biologist, ecologist/taxonomist).

Examples:
- http://www.alicesoftware.com/Products.htm
- DELTA (DEscription Language for TAxonomy) http://www.delta-intkey.com/
- http://ctap.inhs.uiuc.edu/dmitriev/index.asp

That said, I think creating a generic framework would be a good idea.
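
The attribute-selection idea in Michael's message is essentially a
synoptic key, and the core of it is small. A rough sketch only, needing
Java 9 or later for Map.of; the class name SynopticKey, its methods and
the tree data are all made up for illustration and are not taken from
any of the tools listed above:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a synoptic key; not taken from any existing tool.
final class SynopticKey {
    // taxon name -> (attribute name -> attribute value), in insertion order
    private final Map<String, Map<String, String>> taxa = new LinkedHashMap<>();

    void add(String taxon, Map<String, String> attributes) {
        taxa.put(taxon, attributes);
    }

    // Returns the taxa whose attributes agree with every selection made so far.
    // Selections can be made in any order, which is what distinguishes a
    // synoptic key from a dichotomous one.
    List<String> narrow(Map<String, String> selections) {
        List<String> remaining = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> taxon : taxa.entrySet()) {
            boolean matches = true;
            for (Map.Entry<String, String> wanted : selections.entrySet()) {
                String actual = taxon.getValue().get(wanted.getKey());
                if (!wanted.getValue().equals(actual)) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                remaining.add(taxon.getKey());
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        SynopticKey key = new SynopticKey();
        // Illustrative values only, not a vetted dataset.
        key.add("maple", Map.of("leaf shape", "lobed", "fruit", "samara"));
        key.add("oak", Map.of("leaf shape", "lobed", "fruit", "acorn"));
        key.add("birch", Map.of("leaf shape", "ovate", "fruit", "catkin"));

        System.out.println(key.narrow(Map.of("leaf shape", "lobed")));
        System.out.println(key.narrow(Map.of("leaf shape", "lobed", "fruit", "acorn")));
    }
}
```

With under 1000 objects, an in-memory scan like this is plenty; a
database or an index only becomes interesting for much larger sets.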

I might be interested, but I am a Java guy.  :-(

Glen






 Michael == Michael Beccaria [EMAIL PROTECTED] writes:

Michael Hey all, I'm considering trying to create a
Michael framework\tool to allow people to create a whatbird.com
Michael like interface for other types of datasets (plants,
Michael trees, anything really).

Michael The idea is to create a framework allowing users to
Michael create a discovery tool with attribute selections to
Michael narrow down the result set. So, for example, our
Michael faculty/students would identify attributes found in all
Michael trees (leaf shape, fruit, bark, form, etc.) and then
Michael input this data into the tool which would then allow them
Michael to input actual trees and associate them with the
Michael attributes (as well as input description info, pictures,
Michael etc.). The end result would look something like
Michael whatbird.com does with birds.

Michael This will be a challenge for me (but a good one). My
Michael thought is to use a web framework like Django (picked
Michael because I know it a little) but am unsure if you can have
Michael it organize the database tables with the relationships
Michael properly. I considered using solr but thought it would be
Michael overkill considering the relatively small datasets this
Michael tool would be used to create (under 1000 objects) but in
Michael the end it might be a good bet. If approved (I have to
Michael talk to the dean of our forestry department to see if he
Michael will buy into the idea) I will try and create the bulk of
Michael it during January and tweak it the rest of the semester.

Michael Anyone interested in working on this type of project
Michael with me?

Michael Mike Beccaria Systems Librarian Head of Digital
Michael Initiatives Paul Smith's College 518.327.6376
Michael [EMAIL PROTECTED]