Re: Sharepoint connector question

Martijn v Groningen Sun, 12 Sep 2010 13:51:43 -0700

Tomorrow I'll dive into the code and do some more debugging. Last week I
didn't specify any mappings in the mapping tab for the metadata
fields I selected in the metadata tab. But this shouldn't be the
problem, right?
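
To make the question concrete: as I read Karl's description below, with no
mappings configured each selected metadata field should still go out
unchanged as a literal.<name>=<value> argument. Here is a minimal Python
sketch of that expectation. It is not ACF code; the function name
build_literal_args and the sample field names are hypothetical and only
illustrate the behaviour I expect to see in the log.

```python
def build_literal_args(metadata, mappings=None):
    """Turn document metadata into Solr 'literal.*' request arguments.

    Hypothetical helper, not ACF code: a field without a mapping keeps
    its original name, a mapped field is renamed to its target.
    """
    mappings = mappings or {}
    args = []
    for name, values in metadata.items():
        target = mappings.get(name, name)  # no mapping -> name unchanged
        for value in values:
            args.append(("literal." + target, value))
    return args


# Made-up SharePoint-style metadata for one document.
doc_metadata = {"Author": ["martijn"], "Title": ["Quarterly report"]}

# With no mappings, every field still appears as literal.<name>=<value>.
for key, value in build_literal_args(doc_metadata):
    print(f"{key}={value}")
```

If the connector behaves like this sketch, an empty mapping tab alone
would not explain why no literal.* arguments show up.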


Thanks,

Martijn

On 12 September 2010 22:29, Karl Wright <daddy...@gmail.com> wrote:
> Martijn,
>
> (1) The precise svn URL for the ACF version of httpclient is as follows.  My
> apologies for any earlier confusion - I was away from my computer at the
> time.
>
> https://svn.apache.org/repos/asf/incubator/lcf/upstream/commons-httpclient-3x
>
> (2) Each time the Solr connector posts into Solr, you should see a set of
> argument names and values dumped to standard out (or the log).  So it should
> be easy to see what is being sent, and whether the arguments in fact are the
> correct ones for the extracting update request handler, or not.
> Furthermore, the Solr output connector recently had a tab added which
> performs the mapping I alluded to.  This mapping is designed to translate
> metadata coming from a connector like SharePoint, into fields that you
> presumably have in your Solr schema.  However, if you don't set anything,
> the fields are not changed, and you should see an argument for every
> metadata field, something like: literal.xxx=yyy.
>
> If you have a document that you *know* has metadata, and you've specified
> that metadata in the job, and you run the job after you specify that
> metadata, but still see no literal.xxx=yyy corresponding to it in the Solr
> output, then we should spend some time chasing this problem down.  Be wary
> because incremental crawling means you'll probably not see your document
> processed again unless you either change it in SharePoint, or delete and
> recreate the job.  But be reassured that SharePoint metadata was covered by
> the old MetaCarta tests, and there have been no changes of any significance
> to the SharePoint connector since then, so I have no explanation why it
> would not work for you too.  That's why I'm spending time trying to figure
> out if this is a Solr connector issue instead.
>
> Please let me know if this helps you, or whether you need to go deeper into
> debugging.
>
> Karl
>
>
> On Sun, Sep 12, 2010 at 4:05 PM, Martijn v Groningen
> <martijn.is.h...@gmail.com> wrote:
>>
>> I didn't notice that I was under the upstream-changes directory.
>> Thanks for pointing that out.
>>
>> In Solr I have a wildcard (*) dynamic field, so everything ACF sends
>> should end up in my index (or at least that is what I assume). I also
>> did some debugging in the Solr connector and I noticed that no
>> metadata was sent to Solr. I didn't create field mappings in my ACF
>> job. Do you always have to create mappings for metadata?
>>
>> Martijn
>>
>> On 12 September 2010 21:09, Karl Wright <daddy...@gmail.com> wrote:
>> > The source for upstream changes is under
>> > lcf/upstream-changes/httpclient, not under trunk.
>> >
>> > As for the metadata, how are you determining that no metadata is being
>> > indexed?  If this is Solr you are indexing into, have you set up the
>> > appropriate metadata/field mappings?
>> >
>> > Karl
>> >
>> > On 9/12/10, Martijn v Groningen <martijn.is.h...@gmail.com> wrote:
>> >> To authenticate with SharePoint I had to include the domain as well.
>> >> Also the UI reported an error if I didn't specify the username in a
>> >> domain / username format. Maybe this http client issue was just
>> >> particular to the SharePoint / Domain Controller installation I was
>> >> working with. I also couldn't find the source of the ACF version of
>> >> http client. Is it hosted in another source repository?
>> >>
>> >> I still don't understand why the documents I crawled didn't
>> >> have any metadata associated with them. In the job configuration I was
>> >> able to choose which metadata I wanted to include. Do you have any idea
>> >> what might be the cause of this?
>> >>
>> >> Regards,
>> >>
>> >> Martijn
>> >>
>> >> On 12 September 2010 18:40, Karl Wright <daddy...@gmail.com> wrote:
>> >>> Hi Martijn,
>> >>>
>> >>> The ACF version of httpclient has support for NTLMv1, NTLMv2, and NTLM2
>> >>> protocols.  The standard client does not.
>> >>>
>> >>> What this means practically for you depends on how the Windows domain
>> >>> controller you are working with is configured.  You cannot use the
>> >>> off-the-shelf httpclient and still authenticate if the domain controller
>> >>> is configured to not allow LM connections, which is what Microsoft
>> >>> recommends people do.
>> >>>
>> >>> Since the ACF version of httpclient will always try to connect using
>> >>> NTLMv2, this means that you must be more rigorous about setting up your
>> >>> client machine.  First, it must have a name, and it must have a machine
>> >>> account in the domain.  Second, NTLMv2 is much more picky about how you
>> >>> specify user and domain.  The end user documentation provides details
>> >>> that may be helpful to you in this regard.
>> >>>
>> >>> Thanks,
>> >>> Karl
>> >>>
>> >>>
>> >>> On Sun, Sep 12, 2010 at 5:00 AM, Martijn v Groningen
>> >>> <martijn.is.h...@gmail.com> wrote:
>> >>>>
>> >>>> Hi All,
>> >>>>
>> >>>> I've configured the SharePoint connector (to connect to SharePoint
>> >>>> 3.0), the Solr connector and a job that adds documents into Solr. The
>> >>>> only thing that I'm missing is the metadata from SharePoint. Per
>> >>>> document I need to know which users can access it. In the metadata tab
>> >>>> on the job page I've configured the metadata to be included, but this
>> >>>> doesn't end up in my Solr index. Does anybody know what I should do to
>> >>>> also have the metadata in my index?
>> >>>>
>> >>>> I also had another issue with the SharePoint connector which I
>> >>>> managed to solve. But I'm curious to know if someone else encountered
>> >>>> a similar issue.
>> >>>> When I was setting up the SharePoint connector I always got a 401
>> >>>> message on the connectors page as status. I was sure I entered the
>> >>>> correct credentials. After some debugging I noticed that the NTLM
>> >>>> data that was sent by the connector was different than when I did an
>> >>>> HTTP POST with the Firefox Poster plugin to a SharePoint web service
>> >>>> URL (I checked this with Wireshark). After writing a little test case
>> >>>> with the httpclient used in ACF, I got the same 401 error. I then ran
>> >>>> the test with a clean http client (version 3.1), and that ran as
>> >>>> expected. I got a response code 200 back with a SOAP response. I then
>> >>>> used this version of http client (with some class files from the
>> >>>> ACF-provided jar that were missing in the plain jar file) and the
>> >>>> connector worked as expected, as I was able to index documents. Did
>> >>>> someone else have this particular issue? I noticed that ACF is using
>> >>>> httpclient 3.1 (from the manifest file), but I'm curious to know why
>> >>>> http client was modified.
>> >>>>
>> >>>> BTW I've been using the latest trunk version (I did a checkout last
>> >>>> Tuesday). I'm also new to SharePoint.
>> >>>>
>> >>>> Cheers,
>> >>>>
>> >>>> Martijn
>> >>>
>> >>>
>> >>
>> >
>>
>>
>>
>> --
>> Met vriendelijke groet,
>>
>> Martijn van Groningen
>
>



-- 
Met vriendelijke groet,

Martijn van Groningen
