[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467111#comment-17467111 ]

Lewis John McGibbney commented on NUTCH-2856:
---------------------------------------------

Adding some notes from my research.
* The smbj API looks very intuitive; I think it will be a great fit.
* I was concerned about acquiring an SMB server which could be used for integration tests. Luckily, the smbj project does have integration tests which show how this can be done, but there were some missing pieces. They create an SMB (Samba) server via Docker; however, they did not publish the image. Luckily, a fellow Tika PMC member took the initiative to [clone|https://github.com/nddipiazza/smbj-docker] and [publish|https://hub.docker.com/r/ndipiazza/smbj-inttest] it.
* In the Gora project, we've been using [testcontainers|https://www.testcontainers.org/] for some time. This allows us to perform integration testing easily, as you can either run a precanned container or [arbitrarily define one|https://www.testcontainers.org/features/creating_container/]. In this case, I can simply reference _ndipiazza/smbj-inttest_ and then test against it. There is a downside to this, however: the host running the tests must have Docker installed. I therefore need to figure out a means of running this particular integration test only if the host has Docker installed and skipping it otherwise.

> Implement an appropriately licensed protocol-smb plugin
> -------------------------------------------------------
>
>                 Key: NUTCH-2856
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2856
>             Project: Nutch
>          Issue Type: New Feature
>          Components: external, plugin, protocol
>            Reporter: Hiran Chaudhuri
>            Assignee: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.19
>
>
> The plugin protocol-smb advertised on
> [https://cwiki.apache.org/confluence/display/NUTCH/PluginCentral] actually
> refers to the JCIFS library. According to this library's homepage
> [https://www.jcifs.org/]:
> _If you're looking for the latest and greatest open source Java SMB library,
> this is not it. JCIFS has been in maintenance-mode-only for several years and
> although what it does support works fine (SMB1, NTLMv2, midlc, MSRPC and
> various utility classes), jCIFS does not support the newer SMB2/3 variants of
> the SMB protocol which is slowly becoming required (Windows 10 requires
> SMB2/3). JCIFS only supports SMB1 but Microsoft has deprecated SMB1 in their
> products. *So if SMB1 is disabled on your network, JCIFS' file related
> operations will NOT work.*_
> Looking at
> [https://en.wikipedia.org/wiki/Server_Message_Block#SMB_/_CIFS_/_SMB1:|https://en.wikipedia.org/wiki/Server_Message_Block#SMB_/_CIFS_/_SMB1]
> _Microsoft added SMB1 to the Windows Server 2012 R2 deprecation list in June
> 2013. Windows Server 2016 and some versions of Windows 10 Fall Creators
> Update do not have SMB1 installed by default._
> As a conclusion, the chances that SMB1 protocol is installed and/or
> configured are getting vastly smaller. Therefore some migration towards
> SMB2/3 is required. Luckily the JCIFS homepage lists alternatives:
> * [jcifs-codelibs|https://github.com/codelibs/jcifs]
> * [jcifs-ng|https://github.com/AgNO3/jcifs-ng]
> * [smbj|https://github.com/hierynomus/smbj]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
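
The comment above leaves open how to run the SMB integration test only on hosts that have Docker. One possible approach, sketched here purely as an illustration and not taken from the issue, is to probe for a working Docker installation before the test starts. The class name `DockerProbe`, both method names, and the choice of `docker info` as the probe command are assumptions made for this example; it uses only the Java standard library (Java 9 or later for `Redirect.DISCARD`).

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: checks whether a usable Docker installation is present. */
public class DockerProbe {

  /**
   * Runs the given command and reports whether it exited with status 0
   * within the timeout. Returns false if the executable cannot be started,
   * exits with a non-zero status, or does not finish in time.
   */
  static boolean commandSucceeds(long timeoutSeconds, String... command) {
    try {
      Process process = new ProcessBuilder(command)
          .redirectErrorStream(true)                        // merge stderr into stdout
          .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // output is not needed
          .start();
      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        process.destroyForcibly();                          // command hung
        return false;
      }
      return process.exitValue() == 0;
    } catch (IOException e) {
      return false;                                         // e.g. executable not on PATH
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();                   // preserve interrupt status
      return false;
    }
  }

  /** Assumption: "docker info" exits with 0 only when a Docker daemon is reachable. */
  static boolean isDockerAvailable() {
    return commandSucceeds(10, "docker", "info");
  }

  public static void main(String[] args) {
    System.out.println("Docker available: " + isDockerAvailable());
  }
}
```

In a JUnit 4 test, the result could be passed to `org.junit.Assume.assumeTrue("Docker is required", DockerProbe.isDockerAvailable())` at the start of the test or in a `@BeforeClass` method, so that the test is reported as skipped rather than failed on hosts without Docker.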