Re: [fpc-pascal] FPC websites blocked

2026-04-19 Thread Sven Barth via fpc-pascal
Wayne Sherman via fpc-pascal wrote on
Thu, 16 Apr 2026, 17:35:

> The FPC wiki website is also blocked from archive.org.  Note the
> latest capture for this page:
>
> https://web.archive.org/web/20250909050155/https://wiki.freepascal.org/Release_engineering


That is a side effect of Anubis and could be adjusted by whitelisting the
crawler from the Archive.

> Are these pages also blocked from Google search and other search
> engines?  I am unable to check that but someone with write access to
> the web server can check and monitor the status of Google's web
> crawler indexes.  See here:
> https://support.google.com/webmasters/answer/9012289?hl=en


Ordinary search engine crawlers are already whitelisted by Anubis.

> Question:
> "What will the long term consequences be of blocking any of the Free
> Pascal (fpc) websites (i.e. the main website, the forum, and the wiki)
> from AI / LLM tools, archive.org, and web search engines? How will
> users and potential new users of Free Pascal be affected? Summarize in
> one paragraph."
>

It's not relevant what a Stochastic Parrot thinks about this. What is
important is the reality: if we're *not* blocking the AI crawlers then the
Wiki will be down for *everyone*. Before we added Anubis as a protection
the wiki was often not accessible to anyone due to the high amount of
traffic generated by AI crawlers. You can see this with the forum, which
regularly receives high amounts of traffic from crawlers and then isn't
reachable until one of the admins blocks a wide range of IPs used by those
crawlers.
And no, a CDN like Cloudflare is not a solution we are willing to use
because that is its own can of worms.

Regards,
Sven

___
fpc-pascal maillist  -  [email protected]
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal


Re: [fpc-pascal] FPC websites blocked

2026-04-16 Thread Nikolay Nikolov via fpc-pascal


On 4/16/26 6:34 PM, Wayne Sherman via fpc-pascal wrote:

> I was doing some research via ChatGPT and asked the following:
>
> "In relation to the conversation above, what is described here?:
> https://wiki.freepascal.org/Release_engineering"
>
> The response was not worth reading because ChatGPT could not access
> the page.  Note this part of the response:
>
> "One limitation: the wiki page itself was blocked by the site’s access
> protection in this browsing session, so I could not read the full body
> directly. My description is based on the search snippet, category
> links, talk page, and related references."
>
> The FPC wiki website is also blocked from archive.org.  Note the
> latest capture for this page:
> https://web.archive.org/web/20250909050155/https://wiki.freepascal.org/Release_engineering
>
> Are these pages also blocked from Google search and other search
> engines?  I am unable to check that but someone with write access to
> the web server can check and monitor the status of Google's web
> crawler indexes.  See here:
> https://support.google.com/webmasters/answer/9012289?hl=en
>
> Question:
> "What will the long term consequences be of blocking any of the Free
> Pascal (fpc) websites (i.e. the main website, the forum, and the wiki)
> from AI / LLM tools, archive.org, and web search engines? How will
> users and potential new users of Free Pascal be affected? Summarize in
> one paragraph."
>
> Response:
> https://chatgpt.com/share/69e10026-f080-83e8-99bb-c48001698c87


Since you're quoting ChatGPT here, I'll quote an excerpt from ChatGPT's 
Terms of Use ( https://openai.com/policies/row-terms-of-use/ ):


"Accuracy. Artificial intelligence and machine learning are rapidly 
evolving fields of study. We are constantly working to improve our 
Services to make them more accurate, reliable, safe, and beneficial. 
Given the probabilistic nature of machine learning, use of our Services 
may, in some situations, result in Output that does not accurately 
reflect real people, places, or facts.


When you use our Services you understand and agree:

    Output may not always be accurate. You should not rely on Output 
from our Services as a sole source of truth or factual information, or 
as a substitute for professional advice.
    You must evaluate Output for accuracy and appropriateness for your 
use case, including using human review as appropriate, before using or 
sharing Output from the Services.
    You must not use any Output relating to a person for any purpose 
that could have a legal or material impact on that person, such as 
making credit, educational, employment, housing, insurance, legal, 
medical, or other important decisions about them.
    Our Services may provide incomplete, incorrect, or offensive Output 
that does not represent OpenAI’s views. If Output references any third 
party products or services, it doesn’t mean the third party endorses or 
is affiliated with OpenAI."



Best regards,

Nikolay



Re: [fpc-pascal] FPC websites blocked

2026-04-16 Thread Wayne Sherman via fpc-pascal
Some testing results follow.
Wiki Testing page: https://wiki.freepascal.org/

blocked at archive.org:
https://web.archive.org/web/20260307043305/https://wiki.freepascal.org/

Query:  "What is the latest version of this web page that you can
access?: https://wiki.freepascal.org/ Summarize the web page as it is
currently presented to you in 3 sentences or less."

blocked at claude.ai:
Unfortunately, I wasn't able to access the actual content of the Free
Pascal Wiki. The site has deployed Anubis, a bot-protection system
that requires JavaScript-based proof-of-work verification to proceed —
a measure put in place specifically to guard against AI scrapers that
have been aggressively crawling websites and causing server downtime.
Since the challenge requires a browser with JavaScript enabled, I'm
unable to get past it to view the wiki's actual content.

blocked at ChatGPT:
The latest version of that URL I can currently access is not the Free
Pascal Wiki itself, but an “Access Denied” page served by Anubis. It
shows a sad Anubis image, an access-denied error code, a “Try again”
prompt, and a note that the site is protected by Anubis from Techaro.
So, as currently presented to me, I cannot see the wiki’s actual
content or front page text—only the access-block page.

available to Google Gemini (https://wiki.freepascal.org/CudaText):
The latest version of the web page that I can access was last edited
on April 5, 2026, at 16:50 (revision oldid=163215).
Summary of the web page: CudaText is a cross-platform, open-source
text and code editor written in Object Pascal using the Lazarus IDE,
designed for high performance and extensibility. It features syntax
highlighting for over 300 languages, multi-carets, and a JSON-based
configuration system that allows for deep customization. The editor
supports a wide range of operating systems and can be further enhanced
through Python-based plugins, such as LSP clients and project
managers.

Forum Testing:
blocked at archive.org: The forum's main page is archived, but the
posts and topics are not: "The Wayback Machine has not archived that URL."
https://web.archive.org/web/20260309161728/https://forum.lazarus.freepascal.org/

Query:  "What is the latest version of this web page that you can access?:
https://forum.lazarus.freepascal.org/index.php/topic,73900.0.html
Summarize the web page as it is currently presented to you in 3
sentences or less."

available to claude.ai:  I was able to access the page as of today,
April 16, 2026. Here's a summary:
This is a Lazarus/Free Pascal forum thread titled "Delimited text -
how find last index,"...

blocked at ChatGPT #1:  "I’m checking the page directly and, if the
forum blocks access, I’ll report the most recent version of the page I
can actually retrieve."
blocked at ChatGPT #2: "The latest version I could actually access was
not the live board page itself—the direct fetch of board,21.0.html
timed out—but I could access a search-indexed snapshot identifying it
as the “General” board on the Lazarus forum. The direct fetch timed
out, so I’m checking whether the forum is serving a block/interstitial
page instead of the thread itself. The latest version I can access of
that URL is not the thread itself: when I try to open the page
directly, it times out, so I cannot see the post contents or summarize
the actual discussion..."

blocked at Google Gemini #1: I'm having a hard time fulfilling your
request. Can I help you with something else instead?
blocked at Google Gemini #2: The latest version of the page I can
access is the current live version as of April 16, 2026.
"Summary:  This forum thread announces the official release of Lazarus
3.4, detailing its major bug fixes..."
(MY NOTE: this summary is incorrect for that URL)


Re: [fpc-pascal] FPC websites blocked

2026-04-16 Thread Guillermo Martínez Jiménez via fpc-pascal
IIRC somebody suggested blocking all LLM crawlers to prevent DDoS-like
problems.

BTW I use Startpage for web searching and it seems to have access to
all Free Pascal documents, wiki and forums, and DuckDuckGo seems to have
little trouble too, so I don't see the problem.  Just stop using ChatGPT
to do your work (you shouldn't use it to do your work anyway, I'm
pretty sure you can do it way better than any LLM out there).

Regards,
Guillermo "Ñuño" Martínez.

On Thu, 16 Apr 2026 08:34:41 -0700
Wayne Sherman via fpc-pascal wrote:
> I was doing some research via ChatGPT and asked the following:
> 
> "In relation to the conversation above, what is described here?:
> https://wiki.freepascal.org/Release_engineering"
> 
> The response was not worth reading because ChatGPT could not access
> the page.  Note this part of the response:
> 
> "One limitation: the wiki page itself was blocked by the site’s access
> protection in this browsing session, so I could not read the full body
> directly. My description is based on the search snippet, category
> links, talk page, and related references."
> 
> The FPC wiki website is also blocked from archive.org.  Note the
> latest capture for this page:
> https://web.archive.org/web/20250909050155/https://wiki.freepascal.org/Release_engineering
> 
> Are these pages also blocked from Google search and other search
> engines?  I am unable to check that but someone with write access to
> the web server can check and monitor the status of Google's web
> crawler indexes.  See here:
> https://support.google.com/webmasters/answer/9012289?hl=en
> 
> Question:
> "What will the long term consequences be of blocking any of the Free
> Pascal (fpc) websites (i.e. the main website, the forum, and the wiki)
> from AI / LLM tools, archive.org, and web search engines? How will
> users and potential new users of Free Pascal be affected? Summarize in
> one paragraph."
> 
> Response:
> https://chatgpt.com/share/69e10026-f080-83e8-99bb-c48001698c87


Re: [fpc-pascal] FPC websites blocked

2026-04-16 Thread Wayne Sherman via fpc-pascal
On Thu, Apr 16, 2026 at 8:52 AM Tomas Hajny via fpc-pascal
 wrote:
> And BTW (sorry for replying to my own message :-( ) - checking whether
> search engines are blocked from the Wiki is obviously much easier than
> what was suggested in the original post. Searching for "FPC release
> engineering" using both DuckDuckGo and Google resulted in finding the
> above mentioned page as the first result

Unfortunately checking web search indexes is not that simple.  That
wiki page has been there since 2015.  Do you know the last time it was
indexed?  (two ways to check:  use Google's URL Inspection tool, and
analyze web server logs)
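
For the second approach, a minimal sketch of the log analysis might look
like the following. It assumes the web server writes the common "combined"
access log format and that crawler requests can be recognized by a
substring of the User-Agent header such as "Googlebot". The function name,
the sample log lines, and the IP addresses are invented for illustration
and are not taken from the FPC servers.

```python
import re
from datetime import datetime

# Combined Log Format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def last_crawler_hits(lines, agent_substring="Googlebot"):
    """Return {path: (timestamp, status)} for the newest matching request per path."""
    latest = {}
    for line in lines:
        m = LINE_RE.match(line)
        if m is None or agent_substring not in m.group("agent"):
            continue  # unparsable line, or a different client
        when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        path = m.group("path")
        if path not in latest or when > latest[path][0]:
            latest[path] = (when, int(m.group("status")))
    return latest


# Hypothetical sample lines, made up for this example only.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE = [
    f'192.0.2.10 - - [01/Apr/2026:09:00:00 +0000] "GET /CudaText HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.10 - - [05/Apr/2026:17:05:00 +0000] "GET /CudaText HTTP/1.1" 200 5200 "-" "{GOOGLEBOT_UA}"',
    '198.51.100.7 - - [06/Apr/2026:10:00:00 +0000] "GET /Release_engineering HTTP/1.1" 200 9000 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    f'192.0.2.10 - - [02/Mar/2026:08:30:00 +0000] "GET /Release_engineering HTTP/1.1" 403 512 "-" "{GOOGLEBOT_UA}"',
]

if __name__ == "__main__":
    for path, (when, status) in sorted(last_crawler_hits(SAMPLE).items()):
        print(f"{path}: last crawl {when.isoformat()} status {status}")
```

A recent timestamp with a 200 status would suggest the crawler still gets
the page, while only old entries or non-200 statuses would suggest it is
being challenged or blocked. User-Agent strings can be spoofed, so this
only shows requests that claim to come from the crawler.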

For some searches the search engine will provide the date of the page.
Google Search dates the page to 2025, which is when it was last
changed:
Release engineering - Lazarus wiki
Free Pascal wiki
https://wiki.freepascal.org › Release_engineering
Jun 20, 2025 — This article discusses how to make a release of the FPC
compiler on various platforms. Contents. 1 Migration of existing SVN
checkout; 2 Notes ...

Searching for "CudaText" gives a fresh result which means Google
Search is still indexing:
CudaText - Lazarus wiki
Free Pascal wiki
https://wiki.freepascal.org › CudaText
Apr 5, 2026 — CudaText is a cross-platform text editor, written in
Object Pascal language using the Lazarus IDE, with a focus on
performance and a broad featureset.

DuckDuckGo doesn't provide dates, but they primarily use Bing's search
index.  Bing is also up-to-date:
Free Pascal wiki
https://wiki.freepascal.org › CudaText
CudaText - Lazarus wiki - Free Pascal
Apr 5, 2026 · CudaText is a text editor written in Object Pascal using
the Lazarus IDE, with features such as syntax highlighting, code
folding, auto-completion, and …


Re: [fpc-pascal] FPC websites blocked

2026-04-16 Thread Tomas Hajny via fpc-pascal

On 2026-04-16 17:45, Tomas Hajny wrote:

> On 2026-04-16 17:34, Wayne Sherman via fpc-pascal wrote:
>
> > I was doing some research via ChatGPT and asked the following:
> >
> > "In relation to the conversation above, what is described here?:
> > https://wiki.freepascal.org/Release_engineering"
> >
> > The response was not worth reading because ChatGPT could not access
> > the page.  Note this part of the response:
> >
> >  .
> >  .
>
> There are others who can probably provide better answers than me, but I
> believe that your question assuming blocking from all web search
> engines and also assuming blocking for all FPC related resources, not
> just the Wiki, is very misleading. What I know for sure is that we had
> problems with DDoS attacks / overloading of the Wiki and that the
> implemented blocking protects the resources from these overloads. You
> may ask ChatGPT whether it's better to ensure availability of the Wiki
> for users, or whether providing access to LLM tools is more important
> than providing access to users - or you may better try answering that
> question yourself, because asking an LLM whether LLM access is more
> important than access for people doesn't sound sane to me.


And BTW (sorry for replying to my own message :-( ) - checking whether
search engines are blocked from the Wiki is obviously much easier than 
what was suggested in the original post. Searching for "FPC release 
engineering" using both DuckDuckGo and Google resulted in finding the 
above mentioned page as the first result (and, also BTW, found further 
references in the FPC forum)...


Tomas


Re: [fpc-pascal] FPC websites blocked

2026-04-16 Thread Tomas Hajny via fpc-pascal

On 2026-04-16 17:34, Wayne Sherman via fpc-pascal wrote:

> I was doing some research via ChatGPT and asked the following:
>
> "In relation to the conversation above, what is described here?:
> https://wiki.freepascal.org/Release_engineering"
>
> The response was not worth reading because ChatGPT could not access
> the page.  Note this part of the response:
>
>  .
>  .

There are others who can probably provide better answers than me, but I
believe that your question assuming blocking from all web search engines
and also assuming blocking for all FPC related resources, not just the
Wiki, is very misleading. What I know for sure is that we had problems
with DDoS attacks / overloading of the Wiki and that the implemented
blocking protects the resources from these overloads. You may ask
ChatGPT whether it's better to ensure availability of the Wiki for
users, or whether providing access to LLM tools is more important than
providing access to users - or you may better try answering that
question yourself, because asking an LLM whether LLM access is more
important than access for people doesn't sound sane to me.


Tomas