Re: svn/git for website

2020-10-28 Thread Dave Fisher
Hi -

I may have some helpful ideas. Tell me where the Tomcat site repository and build 
are located.

Regards,
Dave

> On Oct 27, 2020, at 12:18 PM, Christopher Schultz wrote:
> 
> Konstantin,
> 
>> On 10/26/20 20:47, Konstantin Kolinko wrote:
>> Fri, 2 Oct 2020 at 00:09, Mark Thomas wrote:
>>> 
>>> Hi all,
>>> 
>>> The topic of migrating the website from svn to git came up at the BoF
>>> session at the end of the Tomcat track. There were strong opinions both
>>> for migrating and for sticking with svn.
>>> 
>>> As a middle ground I'd like to propose we ask Infra to create a git
>>> mirror of the svn repo.
>>> 
>>> For those who favour git:
>>> The git mirror would be read-only but it would be possible to:
>>> - clone the git mirror
>>> - make changes in git
>>> - use git-svn to commit those changes back to svn
>>> - then the mirror automatically replicates them back to git
>>> 
>>> For those who favour svn there would be no change.
>>> 
>>> If there is agreement on this approach, I volunteer to contact infra to
>>> get it set up.
>> My proposal at BoF was for a partial mirror.
>> The issue is that
>> 1. I think that this mirror is intended as a tool to collect feedback
>> / patches from random people, and to lower barriers for contribution.
>> 2. The full Tomcat site is large. It includes documentation for all
>> versions of Tomcat, including javadocs. Those pages are changed rarely
>> and are not needed for people who contribute small changes for the
>> site. The source code for those pages is elsewhere.
> 
> The question I have to ask here is: why do we bother putting all those files 
> in revision control? The user guides for 4 different versions of Tomcat are 
> not a problem, but the javadocs are just stupid to store.
> 
> Is there some policy we are following by having all those files in there? Or 
> is it just to make sure that website "publication" is as simple as "svn 
> checkout"?
> 
>> 3. Subversion has easy commands to cope with such large source trees.
>> This feature is called "sparse checkouts".
>> For our site the necessary commands are documented in README.txt.
>> Essentially, it is done with the --depth and --set-depth arguments to the
>> "svn checkout" and "svn update" commands.
>> Speaking about Git, there are huge repositories [1] out there, but I
>> think that the majority of people are not accustomed to them.
>> [1] https://en.wikipedia.org/wiki/Monorepo
>> I see that Git developers recently did some work to make dealing with
>> such repositories simpler, with addition of "git sparse-checkout"
>> command in Git 2.25.0 [2], released in January 2020.
>> [2] https://github.com/git/git/blob/v2.25.0/Documentation/RelNotes/2.25.0.txt
>> Though I think that support in tools is still lacking. E.g. missing in
>> TortoiseGit. [3]
>> [3] https://gitlab.com/tortoisegit/tortoisegit/issues/1599
>> If we go with a full Git mirror or with migration to Git, then I think
>> that somebody has to prepare an update to README.txt.
>> If we go with a partial Git mirror, I think it could be named
>> "tomcat-site-dev", reserving the name "tomcat-site" for a full mirror
>> if we ever make one.
>> Ignored paths for git-svn are configured with the "--ignore-paths"
>> argument or with the "svn-remote.<name>.ignore-paths" configuration
>> option. [4]
>> [4] https://git-scm.com/docs/git-svn
>> Other notes:
>> 4. Release managers use Subversion to publish the binaries.
>> Thus I expect that they are able to update the published documentation
>> with Subversion as well.
>> 5. Publishing the javadocs generates small changes over a large number
>> of files. The script that generates the commit email notes that the
>> diff is huge and trims it all to a small summary.
>> If we ever migrate to Git, I wonder whether a similar script in Git is
>> able to cope with it.
> 
> We might also want to consider complicating the website-building process in 
> order to simplify the repository. Yes, "disk space is cheap" but it's kind of 
> ridiculous that we have all that derivative content in RCS, separate from its 
> canonical source.
> 
> -chris
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
> 





Re: Discouraging Rogue Users In Tomcat

2020-08-05 Thread Dave Fisher
Hi -

In my experience the scans you are reporting may come from a white-hat security 
scan of your website that is contracted by your security team. These tend to 
try every exploit that is known for any web server to make sure that your web 
apps are secure.

I’m not sure how the Tomcat team will respond to your ideas. But to me this 
seems like a use case for a filter and/or a reason to put httpd in front of 
Tomcat.
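
As a rough illustration of the filter idea, here is a minimal Java sketch of
the decision logic such a filter could delegate to. The class name ProbeRules,
the method isProbe, and the suffix list are all assumptions made up for this
example, not anything that exists in Tomcat; a real servlet Filter would call
something like this from doFilter() before passing the request on.

```java
import java.util.Locale;

// Hypothetical helper (not part of Tomcat): decides whether a request path
// looks like a probe for technology this site does not use.
final class ProbeRules {
    // Assumed example suffixes; a site that really serves PHP would not list ".php".
    private static final String[] UNUSED_SUFFIXES = {".php", ".asp", ".aspx", ".cgi"};

    private ProbeRules() {}

    static boolean isProbe(String path) {
        if (path == null) {
            return false;
        }
        String p = path.toLowerCase(Locale.ROOT);
        // Ignore any query string so "/x.php?a=b" is treated like "/x.php".
        int q = p.indexOf('?');
        if (q >= 0) {
            p = p.substring(0, q);
        }
        for (String suffix : UNUSED_SUFFIXES) {
            if (p.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }
}
```

With these assumed suffixes, a request for 'wp-login.php' would be flagged
while a request for an ordinary HTML page would not.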

All the best,
Dave

> On Aug 5, 2020, at 10:17 AM, Alan Basche wrote:
> 
>> 
>> Alan,
>> 
>> 
>> What kind of protections does this module provide? How does it
>> integrate into Tomcat (e.g. custom
>> Filter/Valve/ServletContextListener, patches to arbitrary places in
>> Tomcat internals, etc.)?
>> 
> 
> The point of this code is to prevent malicious users from probing
> Tomcat hosted apps for weaknesses that can be exploited.  After
> deploying Tomcat a year ago, I found that some automated programs were
> requesting hundreds of files that were not found on my system.
> Besides the security risk of revealing your system's configuration, a
> single attack could increase my daily access log file size by 3 or 4
> times with all of the file requests.  I was not able to find any way
> to discourage this activity with any Tomcat feature.  So I started
> developing my own solutions to shut down these visitors.  Initially my
> efforts focused on manipulating the website apps' responses in various
> ways but ultimately, the most determined black-hats were not deterred.
> A few months ago I found a solution that worked quite well.  I
> concluded that Tomcat should simply refuse to connect to bad IP
> addresses.  Of course, Tomcat has no way of knowing when the server is
> being probed for vulnerabilities.  However, the website apps are fully
> aware of an attack (e.g. when a user asks for 'wp-login.php', but the
> apps don't even use php technology).  So the apps can determine if an
> attack is in-progress, and inform Tomcat of the attack so it can be
> dealt with.
> 
> I added a new class IpHitTable.java to Tomcat that manages a list of
> bad IP addresses, checks for an IP address in the table, has a public
> method that allows apps to add a problem IP address, and can return an
> HTML-formatted String for use in a website page to view the bad IP
> address table.  I modified Tomcat class NioEndpoint.java to call a
> method in IpHitTable to see if an IP address that wants to connect
> should be allowed.  This bad IP address table in IpHitTable is built
> as the system runs and is not stored/loaded to/from a disk or config
> file.  The table is empty each time Tomcat is started.
> 
> When a web app gets a bad request, it tells Tomcat the IP address and
> how many bad hits are permissible before future connections should be
> refused, and then sends a response with a status code that causes
> Tomcat to disconnect the session
> (org.apache.coyote.http11.statusDropsConnection()) if Tomcat informed
> the app that the bad hits limit has been reached.  My own web apps
> allow 3 bad requests before breaking a connection (to allow for
> legitimate file-not-found scenarios... the hostmaster removed/renamed
> HTML files for example).  I also have a default web app (defined in
> server.xml, the 'Engine' definition parameter) that receives requests
> that are not bound for a particular app.  This covers black-hats who
> come to my server by IP address and didn't even know it was a web
> server.  The default web app does not allow 3 bad requests, but rather
> immediately disconnects and tells Tomcat to immediately refuse future
> connections.  This type of request represents the vast majority of the
> black-hat probe attempts.
> 
>> 
>> Are you willing to post your code somewhere like GitHub where everyone
>> can see it?
>> 
>> -chris
> 
> I have posted the 2 described classes at:
> 
> https://github.com/alannotallan/tc-code-01
> 
> Some points regarding the code & design:
> 
> - Look for *Alan* in NioEndpoint.java to see the bit of code I added.
> 
> - I built at least 8 proof-of-concept systems before I came up with
> this design.  They were meant to prove an idea, not implement a final
> version.  Therefore, some changes are to be expected.
> 
> - I didn't include the default web app since my app heavily uses a
> private library.  I would create a bare-bones default web app to
> handle requests in the manner described.
> 
> - I would add a valve to turn this feature on/off and set parameters as
> needed.
> 
> - The IP address list is an ArrayList, which should probably be
> changed to some kind of hash list for performance reasons... although
> it would have to support listing every object for building the HTML
> table output.
> 
> - At the moment I only support the NIO connection protocol.  I can
> easily add NIO2 as well, but I have never been able to build an
> APR/native connection system after weeks of trying, so I didn't look
> into adding that code and would need help testing it.
> 
> Alan
> 
>
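
The hash-based table mentioned in the quoted design notes could look roughly
like the sketch below. This is not the code posted on GitHub: the class name
BadIpTable and its methods are hypothetical, and it assumes that an address is
refused once its bad-hit count reaches the limit passed in by the app (so a
limit of 1 means "refuse immediately"). A ConcurrentHashMap gives constant-time
lookups on the connection path and can still be iterated to build the HTML
listing.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: in-memory table of bad-hit counts per IP address.
// Nothing is persisted, so the table starts empty on every restart.
final class BadIpTable {
    // IP address -> number of bad requests seen so far
    private final Map<String, Integer> hits = new ConcurrentHashMap<>();
    // Addresses whose count has reached the limit supplied by the app
    private final Set<String> blocked = ConcurrentHashMap.newKeySet();

    /** Records one bad request; returns true once the limit has been reached. */
    boolean recordBadHit(String ip, int limit) {
        int count = hits.merge(ip, 1, Integer::sum); // atomic increment
        if (count >= limit) {
            blocked.add(ip);
            return true;
        }
        return false;
    }

    /** Check intended for the connection-accept path. */
    boolean isBlocked(String ip) {
        return blocked.contains(ip);
    }

    /** Lists every entry as an HTML table, sorted by address string. */
    String toHtml() {
        StringBuilder sb = new StringBuilder("<table>\n");
        for (Map.Entry<String, Integer> e : new TreeMap<>(hits).entrySet()) {
            sb.append("<tr><td>").append(e.getKey())
              .append("</td><td>").append(e.getValue())
              .append("</td><td>").append(blocked.contains(e.getKey()) ? "blocked" : "ok")
              .append("</td></tr>\n");
        }
        return sb.append("</table>").toString();
    }
}
```

An app would call recordBadHit(ip, 3) on each bad request and drop the
connection when it returns true; the endpoint would call isBlocked(ip) before
accepting a connection.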