Hey Philippe,
Initially, the idea was to wait for the web page to load and then do the
link filtering. However, this led to a bad user experience: the links were
visible while the page loaded, and then suddenly disappeared once loading
finished and the filtering kicked in. Instead, I decided to perform the
filtering whenever new html content is added. That way, rb and mat links
are removed from the start of the page load and never appear.
As for performance, the overhead isn't noticeable for any of the matrix
groups. The procedure goes smoothly even in the largest matrix group, the
Harwell-Boeing collection with 292 matrices. However, when one opens a page
that lists all available matrices (2700 of them in one page), there are
some noticeable performance issues. The html for such a page is huge, and
the page loading is slow by itself. With added filtering the browser
freezes for a couple of seconds, a couple of times. This comes as no
surprise since the filtering procedure uses pure brute force.
Here's how the filtering is done:
-whenever new html content is added:
  1. fetch all available <a> html elements;
  2. iterate through each element and remove those that contain ".mat" or
     "/RB/" in their href.
Obviously, iterating through each and every <a> element whenever new
content is added is very inefficient. I could improve it by only filtering
<a> elements from the newly added html content. That should make it
seamless even for large html content.
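
A minimal sketch of that improvement might look like the following. It
assumes MutationObserver is available in the embedded WebKit view, which is
an assumption here, and all function names are hypothetical rather than the
ones used in the actual hack:

```javascript
// Sketch only: helper names are hypothetical, not taken from the real code.

// True for hrefs that point at MATLAB (.mat) or Rutherford/Boeing (/RB/) files.
function isUnwantedHref(href) {
  return href.indexOf(".mat") !== -1 || href.indexOf("/RB/") !== -1;
}

// Remove unwanted <a> elements found inside `root` only, instead of
// rescanning the whole document. Returns the number of links removed.
function filterLinksIn(root) {
  var removed = 0;
  // Element nodes have querySelectorAll; text and comment nodes do not.
  if (typeof root.querySelectorAll !== "function") {
    return removed;
  }
  var links = Array.prototype.slice.call(root.querySelectorAll("a"));
  // The added node may itself be a link.
  if (root.tagName === "A") {
    links.push(root);
  }
  links.forEach(function (link) {
    var href = link.getAttribute("href") || "";
    if (isUnwantedHref(href) && link.parentNode) {
      link.parentNode.removeChild(link);
      removed += 1;
    }
  });
  return removed;
}

// Assumption: MutationObserver exists in the embedded browser.
// Filters only the nodes that each mutation actually added.
function watchForNewLinks(target) {
  var observer = new MutationObserver(function (mutations) {
    mutations.forEach(function (mutation) {
      Array.prototype.forEach.call(mutation.addedNodes, function (node) {
        filterLinksIn(node);
      });
    });
  });
  observer.observe(target, { childList: true, subtree: true });
  return observer;
}
```

In a page this would be started with something like
watchForNewLinks(document.documentElement). The cost per update then depends
on the size of the added fragment rather than on the size of the whole page.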
As for the ".mat" and "/RB/" strings I use for filtering:
-there are only 2 differences between the 3 matrix file type links:
  1. the href of links to mtx and rb types ends with ".tar.gz", while links
     to mat end with ".mat";
  2. mtx links have "/MM/", rb have "/RB/", and mat have "/mat/" in their
     href.
All other href parts are the same for each file type.
At first, I used "/mat/" and "/RB/" to remove mat and rb file links.
Unfortunately, there were other links that had "/mat/" in them, so they
were removed unintentionally. Using ".mat" instead, together with "/RB/",
seems to correctly remove only the download links to those file types.
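
To illustrate the difference between the two rules, here is a small sketch.
The example hrefs are hypothetical (only the "/MM/", "/RB/", "/mat/" parts
and the file extensions follow the description above), and the function
names are made up for this example:

```javascript
// Hypothetical hrefs, for illustration only.
var hrefs = {
  mtx: "http://example.org/sparse/MM/HB/bcsstk01.tar.gz",
  rb: "http://example.org/sparse/RB/HB/bcsstk01.tar.gz",
  mat: "http://example.org/sparse/mat/HB/bcsstk01.mat",
  other: "http://example.org/sparse/mat/HB/README.txt" // not a matrix download
};

// First attempt: match on the directory part of the href.
function removeByDirectory(href) {
  return href.indexOf("/mat/") !== -1 || href.indexOf("/RB/") !== -1;
}

// Current rule: match on the ".mat" extension instead of "/mat/".
function removeByExtension(href) {
  return href.indexOf(".mat") !== -1 || href.indexOf("/RB/") !== -1;
}

// Print which links each rule would remove.
Object.keys(hrefs).forEach(function (kind) {
  console.log(
    kind,
    "directory rule:", removeByDirectory(hrefs[kind]),
    "extension rule:", removeByExtension(hrefs[kind])
  );
});
```

Both rules remove the rb and mat links and keep the mtx link, but only the
directory rule also removes the unrelated link under "/mat/". A possible
tightening, not part of the current hack, would be to require that the href
ends with ".mat" rather than merely contains it.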
Namik
On Wed, Jul 16, 2014 at 2:10 PM, Philippe Tillet <phil.til...@gmail.com>
wrote:
> Hey Namik,
>
> If you're filtering mat and rb files only, it could lead to some problems
> when some other links appear. Why not filter out of the page everything
> that is not an mtx link?
> How long is the procedure? :-p
>
> Philippe
>
>
> 2014-07-15 22:15 GMT+02:00 Namik Karovic <namik.karo...@gmail.com>:
>
>> Just a quick update.
>>
>> I added a rather dirty JavaScript hack to the MatrixMarket. Basically, it
>> checks each link of a loaded web page for mat and rb file formats and
>> removes them. Only the .mtx file format links are left untouched. That way
>> users can't download files other than .mtx. It's not pretty and it's a bit
>> slow when modifying large pages such as the one where all matrices are
>> listed (2700 of them), but it works. Further optimizations could make the
>> dynamic modifications seamless (it's full brute force JavaScript at the
>> moment).
>>
>> Let me know if this way of filtering .mtx files is acceptable, so I can
>> work on optimizing the procedure. If not, I'll start making a customized
>> matrix market html page.
>>
>> Regards, Namik
>>
>>
>> On Tue, Jul 15, 2014 at 2:44 AM, Namik Karovic <namik.karo...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>> The only button a user can click has to be a "download and run" button.
>>>> Everything more complicated is confusing and error-prone. This probably
>>>> means that some parsing or processing of the webpage is necessary. If
>>>> this means that we have to statically download and provide a subset of
>>>> the MatrixMarket, so be it...
>>>
>>> Hmm, the web page that lists all the matrices is around 17MB. I could
>>> make a stripped-down static html file, but that seems like a lot of
>>> (unnecessary) work. It might be better to dynamically alter html and remove
>>> the download buttons of each format, replacing them with a single "download
>>> and run" button. I'll try and see if it can be done that way.
>>>
>>>> If you can decompress .tar and .gz, what's the problem with first
>>>> decompressing the .gz portion and then the .tar? ;-)
>>>
>>> Ah. Well. Damn. My bad. I guess I was seeing it as a single file
>>> extension, rather than a combination of two...
>>>
>>>> Yes, something in the user folder. You can't assume to have write access
>>>> to the installation folder.
>>>
>>> Sure.
>>>
>>>> Is this tab needed? Can't this be integrated into the
>>>> Matrix-Market-browser with a "download and run" button?
>>>
>>> Sure I could check if a selected matrix has already been downloaded, but
>>> one would have to first find it in the matrix market and then hit the
>>> "download and run" button. I was just thinking it might be easier to just
>>> browse through local matrices and use one of those, instead of searching
>>> for it in the matrix market. I dunno, just brainstorming.
>>>
>>> Regards, Namik
>>>
>>>
>>> On Mon, Jul 14, 2014 at 11:04 PM, Karl Rupp <r...@iue.tuwien.ac.at>
>>> wrote:
>>>
>>>> Hey,
>>>>
>>>> > I'd like some feedback on the MatrixMarket features.
>>>> >
>>>> > At this moment, the MatrixMarket Screen (I'll call it MM from this
>>>> > point) is a WebKit-based mini web-browser within the Benchmark GUI.
>>>> > Its homepage is http://www.cise.ufl.edu/research/sparse/matrices/ and
>>>> > one can use it to navigate from that page like any other normal web
>>>> > browser.
>>>> > Navigating to other web pages is only possible by following links from
>>>> > the homepage. Since the address bar is not implemented, one can't get
>>>> > far by only following links from the homepage.
>>>> > My question is: should I add an address bar and allow users to visit
>>>> > other pages (e.g. facebook.com <http://facebook.com> =D )? Or is it
>>>> > preferred not to allow users to wander too far away from the market
>>>> > homepage (I'll add a home button that returns to this homepage) ?
>>>>
>>>> No, we want this to be a MatrixMarket-Browser, not a
>>>> general-purpose-browser ;-)
>>>>
>>>>
>>>>
>>>>
>>>> > The MM currently supports downloading .mtx files. There are a few
>>>> > peculiarities about this one:
>>>> > -There are 3 matrix formats available for download (MATLAB, Matrix
>>>> > Market, Rutherford/Boeing) and we need the Matrix Market format
>>>> > (.mtx).
>>>> > As far as I can tell, all links to .mtx files contain the /MM/ string.
>>>> > The files are organized according to their format (/mat/ /MM/ /RB/).
>>>> > So, I do the following: detect when a download is attempted, and
>>>> > filter the link for /MM/. That way I can enable only .mtx files to be
>>>> > downloaded.
>>>> > The problem is that users have to know they may only download the .mtx
>>>> > files, as downloading other files is disabled.
>>>>
>>>> The only button a user can click has to be a "download and run" button.
>>>> Everything more complicated is confusing and error-prone. This probably
>>>> means that some parsing or processing of the webpage is necessary. If
>>>> this means that we have to statically download and provide a subset of
>>>> the MatrixMarket, so be it...
>>>>
>>>>
>>>> > -The .mtx files are downloaded as .tar.gz archives. That means I'll
>>>> > have to use a third-party library to decompress them. Unfortunately,
>>>> > this isn't going so well. I tried using libarchive and zlib, but
>>>> > failed. I can decompress .zip, .tar, & .gz files with libarchive, but
>>>> > not .tar.gz files.
>>>>
>>>> If you can decompress .tar and .gz, what's the problem with first
>>>> decompressing the .gz portion and then the .tar? ;-)
>>>>
>>>>
>>>>
>>>> > -The downloaded files are saved in MatrixMarket folder within the
>>>> > program's install folder. Any better suggestions on where to put these
>>>> > files? Maybe the current user's documents folder?
>>>>
>>>> Yes, something in the user folder. You can't assume to have write access
>>>> to the installation folder.
>>>>
>>>>
>>>> > -There's going to be a tab under MM in which users will be able to
>>>> > browse downloaded matrices and select them for usage in the benchmark.
>>>> > Manually added matrices will be select-able from this tab.
>>>>
>>>> Is this tab needed? Can't this be integrated into the
>>>> Matrix-Market-browser with a "download and run" button?
>>>>
>>>> Best regards,
>>>> Karli
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Want fast and easy access to all the code in your enterprise? Index and
>>>> search up to 200,000 lines of code with a free copy of Black Duck
>>>> Code Sight - the same software that powers the world's largest code
>>>> search on Ohloh, the Black Duck Open Hub! Try it now.
>>>> http://p.sf.net/sfu/bds
>>>> _______________________________________________
>>>> ViennaCL-devel mailing list
>>>> ViennaCL-devel@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/viennacl-devel
>>>>
>>>
>>>
>>
>>
>>
>>
>