On Sat, Jun 6, 2009 at 4:10 PM, Christiaan Hofman <cmhof...@gmail.com> wrote:
>
> On Jun 6, 2009, at 10:25 PM, Michael McCracken wrote:
>
>> On Sat, Jun 6, 2009 at 5:33 AM, Sven-S. Porst
>> <ssp-li...@earthlingsoft.net> wrote:
>>>> No, it's not possible to do async parsing in a web parser. Remember,
>>>> we don't fully control the (async) web page loading.
>>>
>>> Can you elaborate a bit on this? It would seem advantageous to do the
>>> parsing asynchronously and so far I thought we can't do it because
>>> DOMDocument doesn't like running on other threads. Your comment sounds
>>> like there are further reasons for this which aren't apparent to me
>>> yet.
>>
>> I'm also curious. I was thinking of a limited concurrency, where the
>> call to itemsFromDocument:... was still synchronous, but I would issue
>> 50 (25 items per page x 2) concurrent requests to get the individual
>> pages (for the PDF link) and their bibtex.
>>
>> I hadn't tried it because I wasn't sure what the slowest part was -
>> downloading the data or parsing it. And this sounds like parsing it
>> has to be done on the main thread? That wouldn't work so well.
>>
>> -mike
>
> Perhaps it would be possible, but it would be very complex to get it
> to work reliably. If itemsFromDocument:... is sync while the methods
> it calls are not, it somehow should block the main thread. However
> there's the problem that frames that may finish loading can come in
> async, and can be reloaded. And that's also potentially any subframe.
> When some (sub)frame starts to reload, a current async parser should
> be stopped. I'm not even clear whether one can determine safely how
> they should be stopped, given that different frames can trigger
> separate parsers. And surely the web parser currently has no API to
> handle async parsing (and stopping). And I'm really not sure if
> parsing is safe on a separate thread, especially if there can be all
> kind of different parsers around that all have to be sure to be thread
> safe (we cannot now guarantee that because we never have had to
> bother). Also maybe we should consider that several Cocoa URL
> downloading classes have thread safety bugs. Anyway, there are several
> pretty complicated things to consider. I personally would not think it
> worth opening that can of worms.

OK, sounds convincing.

So I think what we need is a way for a parser to tell the web parsing
UI that a page is a TOC page, so a message can be displayed saying
that clicking on individual pubs will show the items to import.

I saw that another parser just returned a 'dummy' bibitem with a
message, but I don't like that solution for the long term.

How about this idea - change '(BOOL)canParseDocument:' to

+ (int)canParseDocument:(DOMDocument *)domDocument
            xmlDocument:(NSXMLDocument *)xmlDocument
                fromURL:(NSURL *)url
                 reason:(NSString **)reason;

and the return value tells us one of these:

0 = can't
1 = won't (see reason)
2 = can and will

or something like that.

Then "reason" can be displayed for any page that won't generate items,
so no one's expecting any. And if every parser returns 0, we can
display something useful the same way too.

This'd be a good place to let the Google Scholar parser remind people
to set the pref - it'd return 1 (won't) and set the reason to a string
reminding people of that.

How does that sound?

While I'm tossing ideas around, the bottom two panes could hide if
there aren't any items, what do you think about that? I didn't do that
for lack of time, but it'd make browsing within BibDesk nicer.

Cheers,
-mike

> Christiaan
>
>>> Sven
>>>
>>> --
>>> Sven-S. Porst . http://earthlingsoft.net/ssp . AIM: cv47al
>>> Pass as best inventor!
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> OpenSolaris 2009.06 is a cutting edge operating system for enterprises
>>> looking to deploy the next generation of Solaris that includes the latest
>>> innovations from Sun and the OpenSource community. Download a copy and
>>> enjoy capabilities such as Networking, Storage and Virtualization.
>>> Go to: http://p.sf.net/sfu/opensolaris-get
>>> _______________________________________________
>>> Bibdesk-develop mailing list
>>> Bibdesk-develop@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/bibdesk-develop