alamb opened a new issue, #23258:
URL: https://github.com/apache/datafusion/issues/23258

   ### Is your feature request related to a problem or challenge?
   
   As software development becomes more and more agent-driven and 
search-driven, it is important that the content on the DataFusion website is 
part of the content used to make those development decisions.
   
   If the DataFusion website is invisible to agents, then we won't show up when 
people ask those agents to help them build tools, etc.
   
   There are a few things that https://datafusion.apache.org is clearly 
missing:
   
   1. `/robots.txt` with clear crawl rules (basically, crawlers should be 
allowed to crawl everything), for example from duckdb: 
https://duckdb.org/robots.txt
   2. `/sitemap.xml` listing canonical URLs, kept up to date on every publish, 
for example from duckdb: https://duckdb.org/sitemap.xml
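
As a rough illustration of "crawl everything", here is a minimal sketch that checks a permissive robots.txt with Python's standard `urllib.robotparser`. The file contents, the `Sitemap:` line, the sample page path, and the helper name `allows_everything` are assumptions for illustration, not the file the project would necessarily ship.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical permissive robots.txt; the Sitemap line assumes the sitemap
# would be published at the site root.
ROBOTS_TXT = """\
User-agent: *
Allow: /

Sitemap: https://datafusion.apache.org/sitemap.xml
"""

# Sample URLs to probe; the second path is made up for illustration.
SAMPLE_URLS = [
    "https://datafusion.apache.org/",
    "https://datafusion.apache.org/user-guide/index.html",
]


def allows_everything(robots_txt: str, sample_urls: list[str]) -> bool:
    """Return True if a generic crawler may fetch every sample URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch("*", url) for url in sample_urls)


if __name__ == "__main__":
    print(allows_everything(ROBOTS_TXT, SAMPLE_URLS))
```

A check like this could run in CI so that an accidental `Disallow: /` does not silently hide the site from crawlers.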
   
   
   
   
   There is a bunch more stuff from 
https://isitagentready.com/datafusion.apache.org, but I think robots.txt and 
sitemap.xml are the place to start.
   
   ### Describe the solution you'd like
   
   Add `robots.txt` and `sitemap.xml`, ideally using one PR for each feature.
   
   The `sitemap.xml` should be auto-generated as part of the Sphinx build process.
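
One possible shape for the auto-generation, sketched with only the Python standard library: walk the built HTML directory after the Sphinx build and emit one `<url>` entry per page. The function names, the `build/html` output path, and running this as a post-build step are all assumptions; an existing Sphinx extension could serve the same purpose.

```python
from pathlib import Path
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_xml(base_url: str, rel_paths: list[str]) -> str:
    """Build sitemap XML text from page paths relative to the site root."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for rel in sorted(rel_paths):
        loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
        loc.text = f"{base_url.rstrip('/')}/{rel}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def write_sitemap(html_dir: Path, base_url: str) -> Path:
    """Scan the built site for .html files and write sitemap.xml next to them."""
    pages = [p.relative_to(html_dir).as_posix() for p in html_dir.rglob("*.html")]
    out = html_dir / "sitemap.xml"
    out.write_text(sitemap_xml(base_url, pages), encoding="utf-8")
    return out


if __name__ == "__main__":
    # Hypothetical output directory; adjust to wherever the docs build lands.
    write_sitemap(Path("build/html"), "https://datafusion.apache.org")
```

Regenerating the file on every build keeps it in sync with what is actually published, which is the "keep it updated on publish" requirement above.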
   
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

