The IRI does not work in my IE. I am using an old version of IE (V6 SP3). But what I really want is to display the correct path name with its Hebrew characters. If I understand you correctly, I need to change the representation of the IRI. How can I do that?

On May 1, 2013 3:14 PM, "Karl Wright" <[email protected]> wrote:
> Right, that is exactly what I would expect.
>
> ManifoldCF uses a URL (which is constructed by the connector) as the
> primary key for every document as indexed in the search engine. The URL
> has two purposes: first, it is supposed to be unique, and second, it is
> supposed to allow someone who browses to that result to locate the
> document. In the case of JCIFS, the environment is presumed to be the
> local active directory domain(s), and the "URL" generated is really a file
> IRI, usually of the form "file://///server.domain/path/filename". You thus
> should be able to paste the "URL" of the document from Solr into a browser
> on a machine in the domain, and see the document load.
>
> As I said before, however, there are already certain problems with this
> because each version of IE differs somewhat in how it deals with non-ASCII
> characters. IRI legal character rules are somewhat different than URL
> rules, but IRI's are still nevertheless escaped in various ways. There are
> also multiple equivalent ways of representing the same file path with
> different IRI's.
>
> It is not typical that the ID and URL fields of a document are presented
> to the user in any meaningful way, so your question is usually academic in
> most settings. If you have a problem with the IRI's not actually working
> in a browser, that's of more immediate interest. Please let us know if
> that's the case.
>
> Thanks,
> Karl
>
>
> On Wed, May 1, 2013 at 8:04 AM, Yossi Nachum <[email protected]> wrote:
>
>> Thanks for your response
>> I am seeing these characters in solr when I search these files.
>> I am using the solr example site and these characters show up in the ID
>> field and URL field.
>> BTW I am running solr and mcf on a linux server
>> On May 1, 2013 1:11 PM, "Karl Wright" <[email protected]> wrote:
>>
>>> Where are you seeing these characters? Are you talking about the file
>>> IRI's that the JCIFS connector generates? Those IRI's are supposed to be
>>> constructed so that your browser would find them if you paste them into the
>>> browser URL window. Unfortunately, there is no good standard, and people
>>> follow IE's behavior, and IE has changed multiple times in how it deals
>>> with non-latin-1 characters.
>>>
>>> Please provide a bit more information so that we can provide a better
>>> answer.
>>>
>>> Karl
>>>
>>>
>>>
>>> On Wed, May 1, 2013 at 3:11 AM, Yossi Nachum <[email protected]> wrote:
>>>
>>>> Hello,
>>>> I install search server with solr and manifoldcf.
>>>> I want to index my netapp files over cifs and I have a problem with
>>>> hebrew files and directories.
>>>> When I search for these files in solr I see "%D7%91%D7%..." instead of
>>>> the directory path that contain hebrew characters.
>>>> I try to run the java process with "-Djcifs.encoding=cp1255" but it
>>>> didn't help.
>>>> Can anyone help and tell me how can I index directories/files in hebrew?
>>>>
>>>> Thanks
>>>> Yossi
>>>>
>>>
>>>
>
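
For what it's worth, the "%D7%91%D7%..." strings mentioned in the thread look like the UTF-8 bytes of Hebrew letters, percent-encoded (0xD7 0x91 is the UTF-8 encoding of the letter bet). Below is a minimal Python sketch of decoding such a value for display only, assuming the stored value really is percent-encoded UTF-8. The server, share, directory, and file names are hypothetical, and this uses only the Python standard library, not any ManifoldCF or Solr API.

```python
from urllib.parse import quote, unquote

# Hypothetical stored value: a file IRI whose Hebrew directory name
# has been percent-encoded as UTF-8 bytes (assumption, not taken from the thread).
encoded = "file://///server.domain/share/%D7%91%D7%93%D7%99%D7%A7%D7%94/report.txt"


def display_form(iri: str) -> str:
    """Decode percent-escapes as UTF-8 so the path is readable on screen.

    Bytes that are not valid UTF-8 become U+FFFD instead of raising.
    """
    return unquote(iri, encoding="utf-8", errors="replace")


# Human-readable form, suitable for showing in a search result page.
readable = display_form(encoded)

# Re-encoding the readable form (keeping ':' and '/' literal) gives back
# the stored value, so the original key does not need to change.
round_trip = quote(readable, safe=":/")
```

The stored ID/URL stays untouched as the unique key; only the value shown to the user is decoded, which sidesteps the question of how a particular browser version escapes non-ASCII characters.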
